A GAME-THEORETIC AND MACHINE-LEARNING APPROACH TO DEMAND RESPONSE MANAGEMENT FOR THE SMART GRID

A thesis submitted to the University of Manchester for the degree of Doctor of Philosophy in the Faculty of Engineering and Physical Sciences

2015

By Fanlin Meng

School of Computer Science


Contents

List of Abbreviations
List of Symbols
Abstract
Declaration
Copyright
Publications
Acknowledgements

1 Introduction
1.1 Context and Motivation
1.2 Research Questions
1.3 Research Objectives
1.4 Contributions
1.5 Thesis Organization

2 Background and Related Work
2.1 Smart Grid and Demand Response
2.1.1 Smart Grid
2.1.2 Two-way Communication Infrastructure
2.1.3 Non-intrusive Load Monitoring
2.1.4 Demand Response
2.2 Demand Response – Related Work
2.2.1 Customer Demand Modelling
2.2.2 Smart Pricing Design for the Retailer
2.3 Critical Analysis
2.3.1 Utility Function and Home Energy Management Based Demand Modelling
2.3.2 Customer Behaviour Learning Based Demand Modelling
2.3.3 Optimization and Game-theory based Smart Pricing Design
2.3.4 Smart Pricing Computation
2.3.5 Customer Behaviour Learning based Smart Pricing Design
2.4 Chapter Summary

3 Smart Pricing to Demand Response I
3.1 Introduction
3.2 Problem Statement
3.3 Stackelberg Game Model Formulation
3.3.1 Energy Usage Scheduling for Customers – Follower Level
3.3.2 Profit Maximization Model for the Retailer – Leader Level
3.3.3 A Two Stage Stackelberg Game Model
3.4 Stackelberg Game Model Solutions
3.4.1 Existence of Stackelberg Strategy
3.4.2 Problem Transformation and Solutions
3.5 Numerical Results
3.5.1 Benefits to Customers and Energy Retailer Based on Public Dynamic Electricity Prices
3.5.2 Benefits to Customers and Energy Retailer Based on Our Proposed Optimal Smart Pricing Scheme
3.6 Chapter Summary

4 Smart Pricing to Demand Response II
4.1 Introduction
4.2 Preliminary Knowledge – Genetic Algorithms
4.2.1 Representation
4.2.2 Tournament Selection and Elitism
4.2.3 Uniform Crossover
4.2.4 Mutation
4.2.5 Constraint Handling
4.3 Problem Statement
4.4 Bilevel Problem Formulation
4.4.1 Customer-side Problem at the Lower Level
4.4.2 Retailer-side Problem at the Upper Level
4.5 Bilevel Model Solutions
4.5.1 Existence of Optimal Solutions to the Bilevel Model
4.5.2 Solutions to the Lower-level Problem
4.5.3 Distributed Optimization Algorithms to the Upper Level Problem
4.5.4 Benefits of the Proposed Distributed Optimization Algorithms
4.6 Numerical Results
4.6.1 Convergence Analysis
4.6.2 Benefits to the Retailer
4.6.3 Benefits to Customers
4.7 Chapter Summary

5 Smart Pricing to Demand Response III
5.1 Introduction
5.2 Preliminary Knowledge – Machine Learning
5.2.1 Conditional Probability and Bayes’ Theorem
5.2.2 Bayesian Inference and Updating [BT11]
5.2.3 Linear Regression Analysis
5.2.4 Recursive Identification
5.2.5 Recursive Least Square with Forgetting Factor [Lju99]
5.3 Problem Statement
5.4 Customer Behaviour Learning Models
5.4.1 Shiftable Appliances
5.4.2 Curtailable Appliances
5.5 Pricing Optimization for Demand Response Management
5.5.1 Notation
5.5.2 Pricing Optimization – Problem Formulation
5.5.3 Solution Algorithm
5.5.4 Benefits to the Retailer and its Customers
5.6 Numerical Results
5.6.1 Learning Algorithms Evaluation
5.6.2 Pricing Optimization
5.7 Chapter Summary

6 Conclusions and Future Work
6.1 Summary of Contributions
6.1.1 Stackelberg Game and Bilevel Optimization based Demand Response Management
6.1.2 Learning based Demand Response Management
6.2 Future Work

Bibliography

Word Count: 33336

List of Tables

2.1 A Brief Comparison between the Current Grid and the Smart Grid
2.2 Summary Table of Smart Pricing based Demand Response
3.1 Shiftable Appliances’ parameters for each home
3.2 Non-shiftable Appliances’ parameters for each home
3.3 Parameters of curtailable appliances for each sensitive user
3.4 Parameters of curtailable appliances for each mid-sensitive user
3.5 Parameters of curtailable appliances for each insensitive user
3.6 Typical consumption level for each category of appliances
3.7 Average Daily Bill comparison of each type of user over one month
3.8 Average PAR in daily load for each type of user over one month
3.9 Combinations of users for case study
3.10 Bill comparison of each type of user under Case 1
3.11 Bill comparison of each type of user under Case 2
3.12 Bill comparison of each type of user under Case 3
4.1 Energy bills saved by different waiting length
4.2 Financial thresholds of waiting length
4.3 Parameter settings of the multi-population GA
4.4 Revenue, cost and profit under different pricing strategies
5.1 Historical data about dish washer usage and prices
5.2 Historical data about PHEV usage and prices
5.3 Parameters for each interruptible appliance
5.4 Parameters for each non-interruptible appliance
5.5 Error Measurements of Learning
5.6 Parameter Settings for Pricing Optimization
5.7 Parameter settings of GA

List of Figures

1.1 Electricity consumption of different sectors from year 1990 to 2013 in UK.
2.1 General Framework of a NILM system
2.2 Categories of Demand Response Programs
2.3 Time-of-Use Pricing (Economy 7, UK)
2.4 Example of Critical Peak Pricing
2.5 Example of Real-time Pricing
2.6 Sample utility functions and marginal benefit function for customers under $U_1(x, \omega)$ and $\alpha = 0.3$.
2.7 Sample utility functions and marginal benefit function for customers under $U_2(x, \omega)$ and $\beta = 1$.
3.1 Structure of a Residential Power Network
3.2 Payment bills without curtailable appliances over one month
3.3 PAR without curtailable appliances over one month
3.4 Payment with curtailable appliances over one month – sensitive user
3.5 PAR with curtailable appliances over one month – sensitive user
3.6 Payment with curtailable appliances over one month – mid-sensitive user
3.7 PAR with curtailable appliances over one month – mid-sensitive user
3.8 Payment with curtailable appliances over one month – insensitive user
3.9 PAR with curtailable appliances over one month – insensitive user
3.10 Energy consumption of different users under Case 1
3.11 Energy consumption of different users under Case 2
3.12 Energy consumption of different users under Case 3
4.1 Flowchart of a Typical Genetic Algorithm
4.2 Bilevel programming model structure
4.3 Convergence speed of the multi-population GA and the simple GA
4.4 Convergence of the multi-population GA and the simple GA under different customer numbers
4.5 Obtained optimal day-ahead prices and flat prices
4.6 Daily electricity payment of one customer over one month
5.1 Learning results under model [MZ13] and [CKS11].
5.2 Fuzzy membership functions of space heater.
5.3 Estimated price elasticities of demand from 8PM to 12AM.
5.4 Actual demand and forecast demand of space heater from 8PM to 12AM.
5.5 Residuals of the forecasting of space heater from 8PM to 12AM.
5.6 Convergence speed of the proposed genetic algorithm.
5.7 Comparison between optimized prices and original prices.
5.8 Energy consumption under optimized prices and original prices.
5.9 Profit and revenue under optimized prices and original prices.
5.10 Profits and revenues under optimized prices and original prices over one week.

List of Abbreviations

DR Demand Response

HEMS Home Energy Management System

SG Smart Grid

DLC Direct Load Control

PHEV Plug-in Hybrid Electric Vehicles

KKT Karush-Kuhn-Tucker

GA Genetic Algorithms

NILM Non-intrusive Load Monitoring

AMI Advanced Metering Infrastructure

ToU Time-of-use Pricing

CPP Critical-Peak Pricing

RTP Real-time Pricing

DAP Day-ahead Pricing

PAR Peak-to-average Ratio

NE Nash Equilibrium

MILP Mixed Integer Linear Programming

MIQP Mixed Integer Quadratic Programming

PV Photovoltaic


EVs Electric Vehicles

V2G Vehicle-to-grid

V2H Vehicle-to-home


List of Symbols

The main notation used throughout the thesis is stated below for quick reference.

Other symbols are defined as required throughout the text.

$n$  n-th customer
$J_{F_n}$  Pay-off function of n-th follower (customer)
$J_L$  Pay-off function of the leader (retailer)
$\mathcal{N}$  Set of customers
$N$  Number of customers in set $\mathcal{N}$
$s$  Shiftable appliance s
$c$  Curtailable appliance c
$S_n$  Set of shiftable appliances for customer n
$C_n$  Set of curtailable appliances for customer n
$E_s$  The total energy consumption for appliance s to finish
$\gamma_s^{\max}$  Maximum hourly energy consumption of shiftable appliance s
$\gamma_s^{\min}$  Minimum hourly energy consumption of shiftable appliance s
$h$  Hour h
$d$  Day d
$\mathcal{H}$  Scheduling window for home energy management scheduling and retail pricing optimization
$H$  Scheduling horizon for home energy management scheduling and retail pricing optimization, i.e. $H = |\mathcal{H}|$
$\mathcal{H}_s$  Scheduling window of shiftable appliance s
$\mathcal{H}_c$  Scheduling window of curtailable appliance c
$P_i^s(d)$  The probability that a customer uses appliance s at the i-th cheapest period based on the data up to day d
$\delta_i(d+1)$  Probability that appliance s is on at the i-th cheapest period
$p_h$  Electricity price offered by the retailer at hour h
$p_h^{\min}$  Minimum price that the retailer can offer at hour h
$p_h^{\max}$  Maximum price that the retailer can offer at hour h
$E_h^{\max}$  Maximum electricity supply capacity of the retailer at hour h
$R^{\max}$  Maximum daily revenue of the retailer
$\beta$  Parameters that need to be identified in the learning model for curtailable appliances

Abstract

A Game-theoretic and Machine-learning Approach to Demand Response Management for the Smart Grid

Fanlin Meng

A thesis submitted to the University of Manchester for the degree of Doctor of Philosophy, 2015

Demand Response (DR) was proposed more than a decade ago to incentivise customers to shift their electricity usage from peak demand periods to off-peak demand periods and to curtail their electricity usage during peak demand periods. However, the lack of two-way communication infrastructure weakens the influence of DR and limits its applications. With the development of smart grid facilities (e.g. smart meters and the two-way communication infrastructure) that enable the interactions between the energy retailer and its customers, demand response shows great potential to reduce customers’ bills, increase the retailer’s profit and further stabilize the power systems. Given such a context, in this thesis we propose smart pricing based demand response programs to study the interactions between the energy retailer and its customers based on game-theory and machine learning techniques. We conduct the research in two different application scenarios: 1) For customers with a home energy management system (HEMS) installed in their smart meters, the retailer will know the customers’ energy consumption patterns by interacting with the HEMS. As a result, the smart pricing based demand response problem can be modelled as a Stackelberg game or bilevel optimization problem. Further, efficient solutions are proposed for the demand response problems and the existence of an optimal solution to the Stackelberg game and the bilevel model is proved; 2) For customers without HEMS installed in their smart meters, the retailer will not know the energy consumption patterns of these customers and must learn customers’ behaviour patterns via historical energy usage data. To realize this, two appliance-level machine learning algorithms are proposed to learn customers’ consumption patterns. Further, distributed pricing algorithms are proposed for the retailer to solve the demand response problem effectively. Simulation results indicate the effectiveness of the proposed demand response models in both application scenarios.


Declaration

No portion of the work referred to in this thesis has been

submitted in support of an application for another degree

or qualification of this or any other university or other

institute of learning.


Copyright

i. The author of this thesis (including any appendices and/or schedules to this thesis) owns certain copyright or related rights in it (the “Copyright”) and s/he has given The University of Manchester certain rights to use such Copyright, including for administrative purposes.

ii. Copies of this thesis, either in full or in extracts and whether in hard or electronic copy, may be made only in accordance with the Copyright, Designs and Patents Act 1988 (as amended) and regulations issued under it or, where appropriate, in accordance with licensing agreements which the University has from time to time. This page must form part of any such copies made.

iii. The ownership of certain Copyright, patents, designs, trade marks and other intellectual property (the “Intellectual Property”) and any reproductions of copyright works in the thesis, for example graphs and tables (“Reproductions”), which may be described in this thesis, may not be owned by the author and may be owned by third parties. Such Intellectual Property and Reproductions cannot and must not be made available for use without the prior written permission of the owner(s) of the relevant Intellectual Property and/or Reproductions.

iv. Further information on the conditions under which disclosure, publication and commercialisation of this thesis, the Copyright and any Intellectual Property and/or Reproductions described in it may take place is available in the University IP Policy (see http://documents.manchester.ac.uk/DocuInfo.aspx?DocID=487), in any relevant Thesis restriction declarations deposited in the University Library, The University Library’s regulations (see http://www.manchester.ac.uk/library/aboutus/regulations) and in The University’s policy on presentation of Theses.

Publications

The following papers, which are closely related to the thesis chapters, have been produced during the research of this project.

Journal Papers

• Fan-Lin Meng and Xiao-Jun Zeng. A Stackelberg game-theoretic approach to optimal real-time pricing for the smart grid. Soft Computing, 17(12):2365–2380, 2013

• Fan-Lin Meng and Xiao-Jun Zeng. A profit maximization approach to demand response management with customers behaviour learning in smart grid. IEEE Transactions on Smart Grid, 2015 (In press)

• Fan-Lin Meng and Xiao-Jun Zeng. A hybrid optimization approach to demand response management for the smart grid. IEEE Transactions on Power Systems, 2015 (In review)

Conference Proceedings

• Fan-Lin Meng and Xiao-Jun Zeng. A Stackelberg game approach to maximise electricity retailer’s profit and minimise customers’ bills for future smart grid. In Computational Intelligence (UKCI), 2012 12th UK Workshop on, pages 1–7. IEEE, 2012

• Fan-Lin Meng, Xiao-Jun Zeng, and Qian Ma. Learning customer behaviour under real-time pricing in the smart grid. In Systems, Man, and Cybernetics (SMC), 2013 IEEE International Conference on, pages 3186–3191. IEEE, 2013

• Fan-Lin Meng and Xiao-Jun Zeng. An optimal real-time pricing for demand-side management: A Stackelberg game and genetic algorithm approach. In Neural Networks (IJCNN), 2014 International Joint Conference on, pages 1703–1710. IEEE, 2014

• Fan-Lin Meng and Xiao-Jun Zeng. Appliance level demand modeling and pricing optimization for demand response management in smart grid. In Neural Networks (IJCNN), 2015 International Joint Conference on, 2015 (In press)

Acknowledgements

Firstly, I would like to thank my supervisor Dr. Xiao-Jun Zeng for his continuous

guidance, encouragement and help across my PhD study.

Many thanks to my wife Dan Chen, my dad Qingchen Meng and my mother

Suqin Tian who are always supporting me and encouraging me in finishing this

PhD.

I also want to thank Mingjie Zhao and Hassan A. Bashir for their help during

the early stage of my PhD study. Special thanks to Richard Mealing and Darren

Hau for their company in playing table tennis every Friday. I would also like to

thank all the MLO group members for their help during the completion of this

PhD project.


Chapter 1

Introduction

1.1 Context and Motivation

Most electricity grids in the world were designed decades ago (e.g. the latest upgrade of the UK electricity grid was in 1965 [Gri]), and some of their components have nearly reached the end of their normal life spans. Meanwhile, the demand for electricity keeps increasing. According to public statistics, electricity demand and consumption in the UK increased by 17.34% from 1990 to 2013 [oECC13].

Residential consumption accounts for around 35% of total UK electricity consumption on average over the 24 years from 1990 to 2013, as can be seen in Figure 1.1.

Although the residential sector accounts for more than one third of total electricity consumption, little attention has been paid to improving its energy efficiency. Two main barriers hinder the development of energy management and energy efficiency programs in the residential sector: end-use customers are charged flat electricity prices, and there is no two-way communication infrastructure between the energy retailer and its customers.

Figure 1.1: Electricity consumption of different sectors in the UK from 1990 to 2013 (annual consumption in TWh for the industry sector, the residential sector, other sectors and total consumption). Source: the data used in this figure can be found at [oECC13].

In general, the existing electricity grid in the residential sector has the following limitations:

• Firstly, there is only limited communication between electricity suppliers (retailers) and customers. The existing communication infrastructure does not allow interactions between electricity suppliers and their customers, and it ignores the important role of customers in energy management.

• Secondly, the traditional electricity meter installed in customers’ houses only measures accumulated power consumption. It cannot provide any useful information to help customers manage their energy usage.

• Thirdly, most customers are billed according to a flat retail electricity price. Under a flat pricing scheme, customers have no incentive to shift their electricity loads from peak-demand periods to off-peak demand periods.

Based on the above analysis, a more intelligent and reliable grid with two-way communication is urgently required.

The smart grid (SG), an electric power system that uses advanced, intelligent and automated communication and control techniques in its generation, transmission, distribution and consumption processes [FMXY12], can overcome the limitations of the current electricity grid.

With the advent and development of smart grids, various energy management programs (e.g. energy conservation and energy efficiency programs, fuel substitution programs and demand response programs) have gained influence. In general, these programs aim to reduce consumption in peak demand periods or shift consumption from peak demand periods to off-peak demand periods. Among these programs, demand response (DR) attracts much attention due to its ability to incentivise customers to change their energy consumption patterns.

According to a report of the U.S. Department of Energy, demand response is defined as follows [oE06]:

“Demand response is a tariff or program established to motivate changes in electric use by end-use customers in response to changes in the price of electricity over time, or to give incentive payments designed to induce lower electricity use at times of high market prices or when grid reliability is jeopardized.”

There are two widely used demand response strategies in the residential sector: direct load control (DLC) and smart pricing. DLC is an approach to residential load management in which the utility company (retailer) can remotely control the operations of certain appliances in a household based on an agreement with the customers [Com12]. Implementing DLC requires customers to provide appliance usage information to the utility company, which raises privacy concerns for them. Smart pricing (dynamic pricing) is an alternative approach that has received much attention from industry and academia in recent years. Smart pricing incentivises electricity customers to lower their usage during peak demand times, or to shift their electricity usage from peak demand times to off-peak demand times. However, there is one major barrier to fully utilizing smart pricing: ordinary customers do not know how to respond to these price signals. As a result, designing an automated energy management system for customers is an important part of smart pricing programs. Further, designing efficient smart pricing strategies that take into account customers’ potential responses is another important issue faced by the retailer and is also an important part of smart pricing programs. A smart pricing strategy is regarded as efficient if: 1) it lowers the retailer’s energy procurement cost and customers’ bills; and 2) it balances energy demand and supply by shifting energy usage from peak-demand times to off-peak demand times, thereby flattening the energy demand curve and avoiding or deferring upgrades of the existing power network infrastructure.
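The two efficiency criteria above can be illustrated numerically. The following minimal Python sketch computes the peak-to-average ratio (PAR) of a load profile and the bill under time-varying prices, before and after load is moved out of the peak period. The four-period prices and loads are made-up values chosen for illustration, and the function names are hypothetical; none of them come from the thesis.

```python
def peak_to_average_ratio(load):
    """Return the peak load divided by the mean load over the horizon."""
    return max(load) / (sum(load) / len(load))


def bill(load, prices):
    """Return the total payment for a load profile under per-period prices."""
    return sum(energy * price for energy, price in zip(load, prices))


# Hypothetical 4-period example: prices in pence/kWh, loads in kWh.
prices = [10, 10, 30, 20]
original = [1.0, 1.0, 4.0, 2.0]  # peak falls in the most expensive period
shifted = [2.0, 2.0, 2.0, 2.0]   # same total energy, moved to cheaper periods

for name, load in (("original", original), ("shifted", shifted)):
    print(name, peak_to_average_ratio(load), bill(load, prices))
```

In this toy case the total energy is unchanged, yet both the PAR and the bill fall once the load is shifted, which is the kind of outcome the two criteria describe.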

1.2 Research Questions

This thesis focuses on smart pricing based demand response management for the smart grid. Since an efficient smart pricing strategy must consider the interactions between the energy retailer and its customers, there are two different application scenarios to consider when modelling the potential responses of customers.

The first scenario arises when customers have home energy management systems (HEMS) installed in their smart meters. The related research questions are: 1) How can we design an efficient home energy management system that acts on behalf of customers to respond optimally to smart prices? Because different customers have different preferences in using energy, modelling each customer’s preferences within a unified HEMS framework is very challenging. 2) How can we design an efficient smart pricing strategy for the retailer that takes into account the customers’ potential responses? This question is challenging because we have to incorporate all the customers’ responses into a game framework, which is a large-scale problem. Further, we have to show that the proposed smart pricing strategy can produce win-win outcomes for both the retailer and its customers.

The second scenario arises when customers’ smart meters do not have HEMS installed. In this case, the retailer needs to learn customers’ energy consumption patterns in order to design efficient and accurate smart pricing strategies. The research questions related to the second scenario are: 3) How can we design an efficient learning system for the retailer to learn customers’ energy consumption patterns? This is challenging because there is no existing literature for reference, so we need to propose original models and methodologies to solve this problem. 4) How can we design an efficient smart pricing strategy for the retailer based on the learning systems? This is challenging because we have to incorporate all the customers’ behaviour learning models into our smart pricing framework to produce win-win outcomes for the retailer and the customers.

1.3 Research Objectives

The overall objective of this thesis is to develop smart pricing based demand response management models for a residential power network that achieve win-win strategies for both the retailer and its customers. On the one hand, by adopting smart pricing strategies, the retailer can incentivize customers to shift their energy usage from peak-demand periods to off-peak demand periods. As a result, the customers’ demand curve becomes flatter and more balanced, so the retailer can reduce expensive peak-time energy procurement and lower its energy cost. On the other hand, customers faced with smart pricing can shift their energy usage from high-price periods to low-price periods to reduce their energy bills.
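The customer-side saving described above can be sketched for a single shiftable, interruptible appliance. The example below is only an illustration under simplifying assumptions and is not the scheduling model developed in the thesis: it assumes a total energy requirement `e_total` and a maximum hourly consumption `gamma_max` (named after $E_s$ and $\gamma_s^{\max}$ in the List of Symbols), a minimum hourly consumption of zero, and no waiting cost. The function name and all numerical values are hypothetical.

```python
def schedule_shiftable(prices, window, e_total, gamma_max):
    """Greedily fill the cheapest hours in `window` until e_total is met.

    prices: dict mapping hour -> price; window: iterable of allowed hours.
    Returns a dict mapping hour -> energy scheduled in that hour.
    """
    window = list(window)
    if e_total > gamma_max * len(window):
        raise ValueError("scheduling window too short for the energy required")
    plan = {hour: 0.0 for hour in window}
    remaining = e_total
    # Visit hours from cheapest to most expensive.
    for hour in sorted(window, key=lambda h: prices[h]):
        if remaining <= 0:
            break
        plan[hour] = min(gamma_max, remaining)
        remaining -= plan[hour]
    return plan


# Hypothetical evening prices in pence/kWh for hours 18 to 22.
prices = {18: 30, 19: 25, 20: 15, 21: 10, 22: 12}
plan = schedule_shiftable(prices, range(18, 23), e_total=5.0, gamma_max=2.0)
cost = sum(prices[h] * energy for h, energy in plan.items())
print(plan, cost)
```

Because the bill is linear in the energy used each hour and the only constraints assumed here are the hourly cap and the total requirement, filling the cheapest hours first gives the lowest bill for this simplified setting. Running the appliance as early as possible instead (2 kWh at 18:00, 2 kWh at 19:00, 1 kWh at 20:00) would cost 125 pence under the same made-up prices.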

More specifically, the research objectives address the four research questions raised in Section 1.2:

• To answer question 1), we aim to design an efficient and comprehensive home energy management system in the context of smart grids to manage the energy usage of home appliances, which consist of shiftable appliances and curtailable appliances.

• To answer question 2), we aim to design an efficient and applicable smart pricing strategy for demand response management by taking into account customers’ potential responses. This can be realized by adopting a Stackelberg game or bilevel optimization model to model the interactions between the energy retailer and its customers.

• To answer questions 3) and 4), we aim to design two appliance-level machine learning algorithms to learn customers’ behaviour patterns in using shiftable appliances and curtailable appliances. Further, an efficient smart pricing strategy is designed based on the learning results.

1.4 Contributions

This section outlines contributions that have been made in this thesis, which are

discussed from the home-level perspective (i.e. home energy management problem

and customer behaviour learning) and the retailer-level perspective (smart pricing

design).

At the home-level, firstly, for customers who have home energy management

systems (HEMS) installed in their smart meters, our proposed HEMS framework

models all possible categories of home appliances including shiftable appliances

(both interruptible appliances such as plug-in hybrid electric vehicles (PHEV) and

non-interruptible appliances such as washing machines and dish washers) and cur-

tailable appliances such as air-conditioning and space heaters. For interruptible

and non-interruptible appliances, our approach further proposes a realistic and

user-friendly waiting time cost model which can be set up easily by an ordinary

customer. For curtailable appliances, which are not considered or modelled in

the current literature, our approach considers them and further proposes possible

types of applications.

Secondly, for customers without HEMS installed in their smart meters, the

retailer does not know customers’ energy consumption patterns and needs to

learn them via historical energy usage data. To achieve this, we propose two

appliance-level customer behaviour learning algorithms to learn the customers’

energy consumption patterns for the retailer. For shiftable appliances whose en-

ergy consumption can be shifted from high price (peak-demand) periods to low

price (off-peak demand) periods where the total energy consumption is fixed, a

probabilistic behaviour model and its learning algorithm are proposed to model an

individual customer’s shifting probabilities dependent on different hourly prices.

For curtailable appliances whose energy consumption cannot be shifted where

the total energy consumption can be adjusted, a regression model is proposed to


model an individual customer’s usage patterns dependent on prices and temper-

atures.

Because there are two scenarios on the customer side, at the

retailer-level, the smart pricing design problem that takes into account customers’

responses has two different situations. For customers who have HEMS installed

in their smart meters, the pricing determination problem faced by the retailer

can be seen as a Stackelberg game or bilevel optimization problem. In this case,

we propose a Karush-Kuhn-Tucker (KKT) condition based approach to solve

the proposed Stackelberg game model effectively for small-scale demand response

problems as well as multi-population genetic algorithms based distributed algo-

rithms to solve the proposed bilevel optimization model effectively for large-scale

demand response problems. Further, the existence of the optimal solution to the

proposed Stackelberg game model and the bilevel optimization model is proved

to ensure that the proposed approaches are built on a sound theoretic foundation.

For customers who do not have HEMS installed, the pricing determination prob-

lem faced by the retailer is dependent on the customer behaviour learning results.

In this case, we propose genetic algorithms (GA) based distributed algorithms to

solve the smart pricing problems effectively.

1.5 Thesis Organization

The rest of this thesis is organized as follows. Chapter 2 firstly introduces

the background of smart grids and demand response. Secondly, we investigate

the state-of-the-art of demand response management from the customer demand

modelling perspective and the smart pricing design perspective. Thirdly, a critical

analysis of the related demand response work is given.

In Chapter 3, we propose a Stackelberg game to model the interactions be-

tween the retailer and its customers, i.e. the retailer determines the retail prices

(smart pricing) by taking into account the customers’ potential responses. The

home energy management problem at the customer-side is modelled to consider

most commonly used types of home appliances such as shiftable appliances and

curtailable appliances. Further, the Stackelberg game that models the smart

pricing based demand response management problem is solved via the Karush-

Kuhn-Tucker (KKT) condition based approach, which is very effective in solving

small-scale problems.


In Chapter 4, instead of using a Stackelberg game, we propose a bilevel op-

timization model to represent the interactions between the retailer and its cus-

tomers. At the customer-side, a comprehensive and complete home energy man-

agement system including most commonly used types of appliances and possible

applications is proposed. Furthermore, we propose multi-population genetic al-

gorithms based distributed optimization algorithms, which can be used in the

large-scale problems, to solve the proposed bilevel model.

While Chapters 3 and 4 assume that customers have HEMS installed, in

Chapter 5, we propose two appliance-level customer behaviour learning models

that can learn customers’ electricity consumption patterns in using shiftable ap-

pliances and curtailable appliances from historical usage data. Thereafter the

smart pricing based demand response problem is solved by genetic algorithms

based distributed algorithms.

This thesis is concluded and the future work is given in Chapter 6.


Chapter 2

Background and Related Work

In this chapter, the background and related work are given to put this thesis into

context. Firstly, the smart grid and demand response related concepts used in this

thesis such as two-way communication, non-intrusive load monitoring (NILM)

and demand response are presented in Section 2.1. Secondly, the related work

of demand response is given in Section 2.2. More specifically, the related work

of customer demand modelling is given in subsection 2.2.1 and the related work

of smart pricing design is given in subsection 2.2.2. Thirdly, the critical analysis

of the existing related work is given in Section 2.3. This chapter is concluded in

Section 2.4.

2.1 Smart Grid and Demand Response

In this section, we firstly introduce some background concepts such as the smart

grid, two-way communication and non-intrusive load monitoring (NILM). Sec-

ondly, the concepts of demand response are given.

2.1.1 Smart Grid

Compared with the traditional power grid, which is used to carry power from

central generators to a large number of customers, the smart grid is a digitally

enabled grid with an intelligent communication infrastructure and uses two-way

flows of electricity and information to create an automated and distributed ad-

vanced energy delivery network [FMXY12].


Table 2.1: A Brief Comparison between the Current Grid and the Smart Grid

Existing Grid              Smart Grid
Electromechanical          Digital
One-way communication      Two-way communication
Centralized generation     Distributed generation
Few sensors                Sensors throughout
Manual monitoring          Self-monitoring
Manual restoration         Self-healing
Failures and blackouts     Adaptive and islanding
Limited control            Pervasive control
Few customer choices       Many customer choices

The smart grid can separate the power system into easily manageable micro-

grids, which consist of smart metering for real time pricing, micro-generation

and micro-storage devices [oECC09] [oE03]. Each micro-grid is connected to the

electricity generators, much like the current power system. But the difference

lies in that each micro-grid can work in the island mode and continue to balance

the supply and demand by managing the electricity flow around the decentralised

network [Mil11]. Table 2.1 gives a brief comparison between the current grid and

the smart grid [FMXY12].

2.1.2 Two-way Communication Infrastructure

A smart meter, which is part of the advanced metering infrastructure (AMI), can

record electric energy consumption in intervals of an hour or less and communicate

that information at least daily back to the utility company for monitoring and

billing purposes [Com12]. The smart meter plays an important role in construct-

ing the future smart electricity grid and is one important part of the two-way

communication infrastructure. In the UK, smart meters are planned to be installed

in all households by 2020 [oEC14] as a part of the smart grid plans across the whole

country.

AMI is an integrated system of smart meters, communications networks, and

data management systems that enables two-way communication between utili-

ties and customers with the goal of providing utilities with real-time data about

customers’ power consumption and allowing customers to make choices about

energy usage based on the time-differentiated prices [SMA15]. AMI provides the

necessary communication and control foundations for implementing critical en-

ergy management services such as automatic meter reading, demand response,

and time-based pricing schemes [MPM10].

Figure 2.1: General Framework of a NILM system

2.1.3 Non-intrusive Load Monitoring

Non-intrusive load monitoring (NILM), or nonintrusive appliance load monitoring

(NIALM), monitors what appliances are used as well as their individual energy

consumption by detailed analysis of the voltage and current going into the house.

Prior to NILM, the common technique for gathering appliance load data was to place

sensors on individual appliances, which is more expensive and intrudes on

the customer’s property. NILM has been developed as a low-cost alternative to

simplify the collection of energy consumption data by utilities [Har92].

To implement the NILM, one needs to install smart meters at higher aggre-

gation points in the building’s power distribution system (e.g. the main feed

for a residential unit) and extract useful information by carefully processing the

smart meter data. Although there are many available NILM methods, the basic

principles of NILM are similar. It firstly selects and characterizes the features or

signatures of a specific appliance. Secondly, one requires a hardware installation

that can detect the selected features. Finally, it detects and extracts the features

in the overall signal using specialized signal processing and machine learning al-

gorithms [ZR11] [BGM+11].
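The three steps above can be illustrated with a deliberately simple sketch. It assumes that the only signature used is the size of a step change in aggregate power, and that each appliance has a single known rated power; the function names, the appliance names and all numbers are hypothetical and are not taken from [Har92], [ZR11] or [BGM+11].

```python
def detect_step_events(power, threshold):
    """Return (sample_index, delta) for every sample-to-sample jump of at least `threshold` watts."""
    events = []
    for i in range(1, len(power)):
        delta = power[i] - power[i - 1]
        if abs(delta) >= threshold:
            events.append((i, delta))
    return events


def label_events(events, signatures, tolerance):
    """Match each jump to the appliance whose rated power is closest, within `tolerance` watts."""
    labelled = []
    for index, delta in events:
        best_name, best_error = "unknown", tolerance
        for name, rated_power in signatures.items():
            error = abs(abs(delta) - rated_power)
            if error <= best_error:
                best_name, best_error = name, error
        # A positive jump is read as a switch-on event, a negative jump as a switch-off event.
        labelled.append((index, best_name, "on" if delta > 0 else "off"))
    return labelled


# Hypothetical aggregate meter readings (watts) and appliance signatures (rated watts).
aggregate_power = [100, 100, 2100, 2100, 2250, 2250, 250, 250, 100]
appliance_signatures = {"kettle": 2000, "fridge": 150}

events = detect_step_events(aggregate_power, threshold=100)
print(label_events(events, appliance_signatures, tolerance=30))
```

Practical NILM systems use richer features and learned classifiers; the sketch only shows how feature selection, detection and matching fit together.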


The general framework of a NILM system is illustrated in Figure 2.1.

2.1.4 Demand Response

Among all the energy management programs such as conservation and energy

efficiency programs, fuel substitution programs and demand response programs,

demand response (DR) is one of the most promising strategies to help balance the

load and supply, reduce the peak demand and increase the grid reliability [Sia14].

DR can be seen as a set of activities to reduce or shift electricity usages to im-

prove electric grid reliability, manage electricity costs, and ensure that customers

receive signals that encourage load reduction during times when the electric grid

is near its capacity [Kil10].

In general, DR can be categorized into two different types: incentive-based

DR and time-based pricing DR [Com12]. Incentive-based DR refers to customers

getting financial rewards for non-DR periods by reducing electricity usage during

periods of system need or stress. Time-based pricing DR refers to reduction

or shifting in customers’ demand when they receive price rising signals [HP08].

Figure 2.2 shows the categories of demand response programs.

Incentive-based DR [Com12]

Direct Load Control (DLC) also known as direct control load manage-

ment, is a demand response activity by which the program sponsor (e.g. energy

retailer) remotely shuts down or cycles a customer’s electrical equipment (e.g. air

conditioner, water heater) on short notice.

Interruptible/Curtailable Service (I/C) refers to the programs in which

the customers receive tariffs or contracts that provide a rate discount or bill credit

for agreeing to change electricity consumption such as reducing load during sys-

tem contingencies.

Demand Bidding & Buy-Back (DB) refers to a program which allows

a demand resource in retail and wholesale markets to offer load reductions at a

price, or to identify how much load it is willing to curtail at a specific price.

Emergency Demand Response Program (EDRP) represents a de-

mand response program that provides incentive payments to customers for load

reductions achieved during an Emergency Demand Response Event.

Figure 2.2: Categories of Demand Response Programs

Capacity Market Programs (CMP) refer to arrangements in which cus-

tomers offer load reductions when system contingencies arise. Participating cus-

tomers typically receive notice of events requiring a load reduction and face penal-

ties when failing to curtail load. Incentives usually consist of up-front reservation

payments.

Ancillary Services (A/S) refer to services that ensure reliability and sup-

port the transmission of electricity to customer loads. Such services may include:

energy imbalance, operating reserves, contingency reserves, spinning reserves,

supplemental reserves, reactive supply and voltage control, and regulation and

frequency response.

Time-based Pricing DR

Time-based Pricing DR (also known as smart pricing) incentivises electricity cus-

tomers to lower their usage during peak times, or shift their electricity usage from

peak demand periods to off-peak demand periods.

There are various smart pricing schemes: Time-of-Use (ToU) Pricing, Critical-

Peak Pricing (CPP), Real-time Pricing (RTP) and Day-ahead Pricing (DAP)

[Sia14].

Time-of-Use Pricing In ToU pricing, each customer pays a higher amount

of money (on-peak prices) for the peak hours during the day and lower (off-peak)

prices during the night. Prices paid for energy consumed during these periods

are pre-established and known to consumers in advance. Typically, the prices

do not change more than twice a year (e.g. winter price schedule and

summer price schedule) [WL11]. Among all the ToU pricing programs, Economy

7 is adopted by UK electricity suppliers to provide 7 hours of cheap off-peak

electricity during the night. Prices during the rest of the time are, by contrast,

relatively expensive. Figure 2.3 shows what the Economy 7 Time-of-Use pricing

looks like.

Critical-Peak Pricing Because highly peaked demand lasts for only a small

number of days a year, mainly owing to very hot or very cold weather, another

pricing scheme, critical-peak pricing (CPP), has been proposed. CPP can be seen

as an improved ToU tariff that traces critical supply periods dynamically. Compared

to time-of-use pricing, the day is divided into more periods, e.g. peak, off-peak,

and ‘shoulder’ periods. CPP charges customers extremely high (critical) prices

when the system or market conditions meet pre-defined criteria [Her07]. Figure 2.4

shows an example of a critical-peak pricing program.

Figure 2.3: Time-of-Use Pricing (Economy 7, UK). Axes: hour ending versus price (p/kWh).

Figure 2.4: Example of Critical Peak Pricing. Axes: day of week versus price (cents/kWh).

Figure 2.5: Example of Real-time Pricing. Axes: hour ending versus price (cents/kWh).

Real-time Pricing Although ToU pricing and CPP schemes show bene-

fits to the overall power system, they cannot reflect variations of the prices in

the wholesale market in real time, and thus are unable to effectively incentivise

customers to lower their energy usages during peak-demand periods or to shift

their energy usages from high-demand periods to low-demand periods. RTP is

an effective solution to the above problem. Under RTP, the price of electricity

varies at different hours of the day to reflect the varying prices at the wholesale

level in real-time [MRWJ+10a]. Figure 2.5 is an example of an RTP program.

Day-ahead Pricing Under DAP, the customers will receive the prices for

the next 24 hours the night before the delivery time, and these prices will be


fixed on the day of consumption. The advantage of DAP over RTP is that

the customers will have the price certainty over the next 24 hours so that they

can plan their energy usage accordingly based on the announced prices [JT13].

2.2 Demand Response – Related Work

The demand response (DR) problems we are looking at in this thesis fall within

the category of time-based pricing DR programs. As the pricing based demand

response always involves interactions between the retailer and its customers, in

the following, we are going to investigate and explore the related work from the

customer perspective and the retailer perspective respectively. At the customer-

level, the state of the art in customer demand modelling methods in response

to time-differentiated prices is first discussed in subsection 2.2.1. At the

retailer-level, the related work on smart pricing design for the retailer that takes

into account the customers’ potential responses is explored in subsection 2.2.2.

2.2.1 Customer Demand Modelling

Given time-differentiated pricing signals, the customers might respond to the

price signals by shifting their energy usages from peak-demand times to off-peak

demand times or curtailing their energy usages during peak-demand times. How-

ever, how to exactly model customers’ behaviours in using the electricity when

faced with time-differentiated price signals is worth investigating. In the follow-

ing, we will explore the existing customer demand modelling methods including

theoretic utility function based demand modelling, home energy management

based demand modelling and customer behaviour learning based demand mod-

elling.

In theoretic utility function based demand modelling, it is assumed that cus-

tomers’ energy consumption patterns can be represented in the form of pre-

determined theoretic utility functions that model customers’ aggregated energy

consumption preferences or behaviours.

Instead of modelling customers’ aggregated energy consumption behaviours,

home energy management based demand modelling can model customers’ energy

consumption patterns of each appliance. It is assumed that the customers have

home energy management systems (HEMS) installed in their smart meters and


the HEMS can provide customers with optimal energy usage schedules for each

appliance.

Note that the above two customer demand modelling methods both

assume that the customers’ behaviour patterns are predetermined and can be

identified by the retailer via the two-way communication infrastructure which

exists between the retailer and its customers. Further, they assume that the cus-

tomers’ behaviour patterns will not change for a certain period (e.g. 24 hours)

once the predetermined consumption schedules are established, i.e. the customers

will follow the predetermined consumption schedules exactly. However, some

customers might not have such HEMS installed in their smart meters to

predetermine the detailed energy consumption schedules for them. As a result,

they will schedule the usage of their appliances according to their personal knowl-

edge and preferences. Even customers with HEMS installed might

not always be able to follow the predetermined energy consumption schedules.

In the above situations where uncertainties exist in customers’ energy usage pat-

terns, the retailer does not know customers’ electricity consumption behaviours.

Instead, the retailer has to learn customers’ energy consumption patterns via

historical energy usage data in order to establish the customer demand model.

Theoretic Utility Functions based Demand Modelling

In an electricity market, different customers need different levels of electricity,

and might feel different levels of satisfaction with the same price and amount

of consumed electricity [BY13]. The customers’ preferences and their energy

consumption patterns can be represented in the form of utility functions, which

are based on the concepts from microeconomics [MCWG95] [SMRS+10b].

The utility function for customers can be defined as U(x, ω), where x is the

electricity consumption level of the customer and ω is a parameter which might

vary among customers and at different times of the day. The utility function

represents the customer’s level of satisfaction in electricity consumption. The

utility functions are assumed to have the following properties [SMRS+10b]:

• Property 1: Utility functions are non-decreasing, which means that cus-

tomers always intend to consume more electricity before reaching the maximum

consumption level. As a result, we have:

\[ \frac{\partial U(x, \omega)}{\partial x} \ge 0. \tag{2.1} \]


• Property 2: The marginal benefit can be defined as the additional satis-

faction or utility that a person receives from consuming an additional unit of

a good or service [MCWG95]. According to the above definition, the marginal

benefit of energy customers is ∂U(x, ω)/∂x, which is a non-increasing function. The

above means that the utility functions are concave and the level of satisfaction

for customers gradually gets saturated, i.e.:

\[ \frac{\partial^2 U(x, \omega)}{\partial x^2} \le 0. \tag{2.2} \]

• Property 3: It is assumed that, for a fixed consumption level x, a larger ω

implies a larger U(x, ω), i.e.:

\[ \frac{\partial U(x, \omega)}{\partial \omega} \ge 0. \tag{2.3} \]

• Property 4: It is assumed that no electricity consumption brings no benefit,

i.e.:

\[ U(0, \omega) = 0, \quad \forall \omega > 0. \tag{2.4} \]

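There are different utility functions satisfying the above properties. One com-

monly used in power consumption modelling is the quadratic utility function with

linearly decreasing marginal benefit [SMRS+10b]:

\[
U_1(x, \omega) =
\begin{cases}
\omega x - \dfrac{\alpha}{2} x^2 & \text{if } 0 \le x \le \dfrac{\omega}{\alpha}, \\[2ex]
\dfrac{\omega^2}{2\alpha} & \text{if } x \ge \dfrac{\omega}{\alpha},
\end{cases}
\tag{2.5}
\]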

where α is a pre-determined parameter. Figure 2.6 shows examples of utility

functions and marginal benefit functions under U1(x, ω).

Another form of utility function can be defined as follows [MZZ+13]:

\[ U_2(x, \omega) = \omega \ln(\beta + x). \tag{2.6} \]

Note that β needs to be set to 1 to satisfy Property 4. Similarly, examples

of utility functions and marginal benefit functions under U2(x, ω) can be found in Figure

2.7.
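The two utility forms in (2.5) and (2.6) and their marginal benefits can be evaluated numerically with a short sketch such as the one below. The function names are illustrative, β defaults to 1 as required by Property 4, and the parameter values in the usage lines are arbitrary examples rather than values prescribed by the cited works.

```python
import math


def quadratic_utility(x, omega, alpha):
    """U1 in (2.5): quadratic up to the saturation point omega/alpha, constant beyond it."""
    if x < 0:
        raise ValueError("consumption must be non-negative")
    if x <= omega / alpha:
        return omega * x - 0.5 * alpha * x * x
    return omega * omega / (2.0 * alpha)


def quadratic_marginal_benefit(x, omega, alpha):
    """Derivative of U1 with respect to x: linear decrease, then zero after saturation."""
    return max(omega - alpha * x, 0.0)


def log_utility(x, omega, beta=1.0):
    """U2 in (2.6); beta = 1 gives U2(0, omega) = 0."""
    return omega * math.log(beta + x)


def log_marginal_benefit(x, omega, beta=1.0):
    """Derivative of U2 with respect to x."""
    return omega / (beta + x)


# Example evaluation with arbitrary parameters.
for consumption in (0.0, 1.0, 2.0, 5.0):
    print(consumption,
          quadratic_utility(consumption, omega=1.0, alpha=0.3),
          log_utility(consumption, omega=0.5))
```

Both functions return zero utility at zero consumption and have non-increasing marginal benefit, which is what Properties 1, 2 and 4 require.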

Suppose a customer consumes x kWh electricity at a rate of P dollars per

kWh, then he/she will be charged Px dollars. As a result, the welfare function

of the customer (take U2(x, ω) for example) can be represented as

\[ W_2(x, \omega) = U_2(x, \omega) - Px. \tag{2.7} \]

Figure 2.6: Sample utility functions (ω = 1, 0.5, 0.3) and marginal benefit function (ω = 0.3) for customers under U1(x, ω) and α = 0.3. Axes: electricity consumption (kWh) versus utility / marginal benefit of customer.

Figure 2.7: Sample utility functions (ω = 1, 0.5, 0.1) and marginal benefit function (ω = 0.5) for customers under U2(x, ω) and β = 1. Axes: electricity consumption (kWh) versus utility / marginal benefit of customer.

For each announced price value P, each customer tries to adjust his/her elec-

tricity consumption x to maximize his/her personal welfare, and the customer’s con-

sumption function can be obtained by setting the derivative of (2.7) with respect

to x to zero. That is,

\[ D(P) = \frac{\omega}{P} - 1. \tag{2.8} \]
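The step from (2.7) to (2.8) can be spelled out as follows, taking β = 1 in U2 and assuming ω > 0 and P > 0. The remark on non-negativity after the block is an added observation, not a statement from the cited work.

```latex
\[
\frac{\partial W_2(x, \omega)}{\partial x}
  = \frac{\omega}{1 + x} - P = 0
\;\Longrightarrow\;
x^{*} = \frac{\omega}{P} - 1 = D(P),
\qquad
\frac{\partial^2 W_2(x, \omega)}{\partial x^2}
  = -\frac{\omega}{(1 + x)^2} < 0 .
\]
```

The negative second derivative confirms that the stationary point is a maximum. Since consumption cannot be negative, the expression is meaningful for P ≤ ω; for P > ω the welfare decreases for all x ≥ 0, so the best response would be x = 0.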

Home Energy Management based Demand Modelling

Due to the fact that the customers might not have enough knowledge and expe-

riences on how to respond to the electricity price signals, [MRLG10a] provides

an optimal residential energy scheduling scheme for home appliances in response

to time-differentiated prices, where the smart meters and the two-way communi-

cation infrastructures are assumed to exist between the energy retailer and its

customers.

Let A denote the set of appliances in one residential house. For each appliance

a ∈ A, an energy consumption scheduling vector $x_a = [x_a^1, \ldots, x_a^h, \ldots, x_a^H]$ is defined,

where H is the scheduling horizon that indicates the number of hours ahead which

are taken into account for decision making in energy consumption scheduling.

For each upcoming hour h, a real-valued scalar $x_a^h$ denotes the corresponding

one-hour energy consumption that is scheduled for the appliance. [MRLG10a] firstly

proposes a bill minimization problem, which is as follows.

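\[
\begin{aligned}
\min_{x_a^h} \quad & \sum_{h=1}^{H} p^h \times \Big( \sum_{a \in A} x_a^h \Big) \\
\text{s.t.} \quad & \sum_{h=1}^{H} x_a^h = E_a, \\
& \gamma_a^{\min} \le x_a^h \le \gamma_a^{\max}, \\
& \sum_{a \in A} x_a^h \le E^{\max}
\end{aligned}
\tag{2.9}
\]

where $p^h$ is the given price signal at hour h, $E_a$ is the total energy consumption to

finish the operations of appliance a, $\gamma_a^{\min}$ and $\gamma_a^{\max}$ represent the standby power

level and maximum power level of each appliance a and $E^{\max}$ represents the

maximum total energy consumption at each residential unit at each hour h.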

As some customers may delay their energy use to later times (low price peri-

ods) to reduce their energy bills, it will however cause some inconvenience (e.g.


long waiting times) to such customers. As a result, another important thing

worth mentioning here is how to model customers’ comfort levels when using

certain types of appliances such as washing machines. To solve this, [MRLG10a]

further proposes a waiting cost model by adding the waiting cost term to the

optimization objective, where the waiting cost is defined as the product of the wait-

ing parameter and the electricity consumption of each appliance, summed over the scheduling

horizon. Therefore, the cost of waiting can be modelled as follows:

\[ \sum_{h=1}^{H} \sum_{a \in A} \rho_a^h \times x_a^h \tag{2.10} \]

where for each appliance a ∈ A and each hour $h \in \{1, \ldots, H\}$, the waiting parameters

$\rho_a^h \ge 0$.

Finally, the energy consumption scheduling problem for one single residential

customer is formulated as follows:

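\[
\min_{x_a^h} \quad \sum_{h=1}^{H} p^h \times \Big( \sum_{a \in A} x_a^h \Big)
  + \lambda_{wait} \sum_{h=1}^{H} \sum_{a \in A} \rho_a^h \times x_a^h
\tag{2.11}
\]

where parameter $\lambda_{wait}$ is used to control the importance of the waiting cost term

in the objective function.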
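To make the trade-off in (2.11) concrete, the following sketch schedules a single appliance. It assumes the constraints of (2.9) still apply but drops the household cap $E^{\max}$, so that each appliance can be treated separately; under that assumption the problem is a continuous knapsack and filling the cheapest hours first is optimal. The function name, argument names and numbers are illustrative and do not come from [MRLG10a].

```python
def schedule_appliance(prices, waiting, lam, energy, p_min, p_max):
    """Minimise sum_h (p^h + lam * rho^h) * x^h
    subject to sum_h x^h = energy and p_min <= x^h <= p_max (household cap ignored)."""
    horizon = len(prices)
    if len(waiting) != horizon:
        raise ValueError("prices and waiting parameters must have the same length")
    if not horizon * p_min <= energy <= horizon * p_max:
        raise ValueError("energy requirement is infeasible for the given power limits")

    # Effective per-unit cost of consuming in each hour.
    cost = [p + lam * rho for p, rho in zip(prices, waiting)]

    # Start every hour at standby level, then place the rest in the cheapest hours.
    schedule = [p_min] * horizon
    remaining = energy - horizon * p_min
    for h in sorted(range(horizon), key=lambda hour: cost[hour]):
        extra = min(p_max - p_min, remaining)
        schedule[h] += extra
        remaining -= extra
        if remaining <= 0:
            break
    return schedule


# Hypothetical four-hour horizon; waiting parameters grow with the delay.
prices = [0.30, 0.10, 0.20, 0.10]
waiting = [0.0, 1.0, 2.0, 3.0]
print(schedule_appliance(prices, waiting, 0.0, 3.0, 0.0, 2.0))  # bill only
print(schedule_appliance(prices, waiting, 0.1, 3.0, 0.0, 2.0))  # bill plus waiting cost
```

With the waiting weight set to zero the schedule simply follows the cheapest prices; a positive weight pulls consumption towards earlier hours. Reinstating the coupled cap $E^{\max}$ would require a general linear programming solver.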

[CKS11] simplifies the above waiting cost scheme and employs a hard-coded

unit waiting cost. As a result, the waiting cost is defined as the product of the unit

waiting cost and the number of hours waited. Note that we are using the same

notation as in [CKS11]. Finally, given the price vector $P = \{p^1, p^2, \ldots, p^H\}$ for

the time horizon, the optimal scheduled start time $s^*$ for appliance a is obtained

by solving the following optimization problem:

\[ \min_{s} \; (s - t_0) \times \psi_a + \sum_{h=s}^{s+T_a} p^h \times E_a \tag{2.12} \]

where $t_0$ is the time slot requested to turn on originally and each time slot of

delay for appliance a incurs a cost of $\psi_a$ dollars. $T_a$ denotes the non-interruptible

operation duration of appliance a and the power usage of appliance a over this duration

is $E_a$ kW. Note that in the above model, the hourly energy consumption

stays at the same level.
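Because the only decision variable in (2.12) is the start slot, the problem can be solved by enumeration. The sketch below does so under two assumptions: slots are indexed from 0, and the summation limits are taken as written in (2.12), i.e. slots s to s + T_a inclusive. The function and argument names are illustrative and are not from [CKS11].

```python
def best_start_time(prices, t0, duration, energy_per_slot, unit_wait_cost):
    """Enumerate start slots s >= t0 and return (s, cost) minimising
    (s - t0) * unit_wait_cost + sum_{h=s}^{s+duration} prices[h] * energy_per_slot."""
    best_slot, best_cost = None, float("inf")
    last_start = len(prices) - 1 - duration  # latest s for which s + duration is a valid slot
    for s in range(t0, last_start + 1):
        energy_cost = sum(prices[h] * energy_per_slot
                          for h in range(s, s + duration + 1))
        total = (s - t0) * unit_wait_cost + energy_cost
        if total < best_cost:  # strict comparison keeps the earliest start on ties
            best_slot, best_cost = s, total
    return best_slot, best_cost


# Hypothetical six-slot price vector with a cheap period in the middle.
prices = [5, 5, 1, 1, 5, 5]
print(best_start_time(prices, t0=0, duration=1, energy_per_slot=1, unit_wait_cost=1))
print(best_start_time(prices, t0=0, duration=1, energy_per_slot=1, unit_wait_cost=5))
```

A small unit waiting cost lets the appliance wait for the cheap slots, while a large one makes an immediate start preferable.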

Instead of studying each single customer’s response to the price signals, [MRWJ+10a]


extends their work [MRLG10a] by proposing an autonomous demand-side man-

agement scheme, where they formulate an energy consumption scheduling frame-

work among customers. In the following, we are going to use the same notations

as in [MRWJ+10a].

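Let $\mathcal{N}$ denote the set of customers, where the number of customers is $N = |\mathcal{N}|$.

For each customer $n \in \mathcal{N}$, the set of appliances in each household is defined as

$\mathcal{A}_n$. For each appliance $a \in \mathcal{A}_n$, an electricity consumption scheduling vector is

defined as $x_{n,a} = [x_{n,a}^1, \ldots, x_{n,a}^h, \ldots, x_{n,a}^H]$. Let $\mathcal{H} = \{1, 2, \ldots, H\}$. For each customer

n, the total load at hour h can be obtained as

\[ l_n^h = \sum_{a \in \mathcal{A}_n} x_{n,a}^h, \quad h \in \mathcal{H}. \tag{2.13} \]

As a result, the total load across all customers at each hour of the day $h \in \mathcal{H}$

can be defined as:

\[ L_h \triangleq \sum_{n \in \mathcal{N}} l_n^h. \tag{2.14} \]

Then the daily peak and average load are calculated as

\[ L_{peak} = \max_{h \in \mathcal{H}} L_h, \tag{2.15} \]

\[ L_{avg} = \frac{1}{H} \sum_{h \in \mathcal{H}} L_h. \tag{2.16} \]

As a result, the peak-to-average ratio (PAR) of the power system can be

defined as

\[ PAR = \frac{L_{peak}}{L_{avg}} = \frac{H \max_{h \in \mathcal{H}} L_h}{\sum_{h \in \mathcal{H}} L_h}. \tag{2.17} \]

Finally, the peak-to-average ratio minimization problem can be formulated as

\[
\min_{x_{n,a}} \quad
\frac{H \max_{h \in \mathcal{H}} \Big( \sum_{n \in \mathcal{N}} \sum_{a \in \mathcal{A}_n} x_{n,a}^h \Big)}
     {\sum_{n \in \mathcal{N}} \sum_{a \in \mathcal{A}_n} E_{n,a}}
\tag{2.18}
\]

where $E_{n,a}$ stands for the total electricity consumed in the operation duration of

appliance $a \in \mathcal{A}_n$.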
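The quantities in (2.13) to (2.17) can be computed directly from a set of schedules. The sketch below is illustrative only: the nested-list data layout (customer, then appliance, then hour) and the function names are assumptions made for the example, not part of [MRWJ+10a].

```python
def hourly_total_load(schedules):
    """Sum x_{n,a}^h over all customers n and appliances a, giving L_h for each hour h.

    schedules[n][a] is the list of hourly consumptions of appliance a of customer n.
    """
    horizon = len(schedules[0][0])
    total = [0.0] * horizon
    for customer in schedules:
        for appliance in customer:
            for h, consumption in enumerate(appliance):
                total[h] += consumption
    return total


def peak_to_average_ratio(total_load):
    """PAR = H * max_h L_h / sum_h L_h, as in (2.17)."""
    return len(total_load) * max(total_load) / sum(total_load)


# Two hypothetical customers over a four-hour horizon.
coincident = [[[1, 0, 0, 1], [0, 2, 0, 0]], [[0, 2, 0, 0]]]
shifted = [[[1, 0, 0, 1], [0, 2, 0, 0]], [[0, 0, 2, 0]]]
print(peak_to_average_ratio(hourly_total_load(coincident)))
print(peak_to_average_ratio(hourly_total_load(shifted)))
```

Both schedule sets consume the same total energy, so the denominator of (2.18) is unchanged; moving one appliance away from the shared peak hour lowers only the numerator, and therefore the PAR.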


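Since H and $\sum_{n \in \mathcal{N}} \sum_{a \in \mathcal{A}_n} E_{n,a}$ are predetermined and fixed, the problem

can be rewritten as

\[
\min_{x_{n,a}} \; \max_{h \in \mathcal{H}}
\Big( \sum_{n \in \mathcal{N}} \sum_{a \in \mathcal{A}_n} x_{n,a}^h \Big)
\tag{2.19}
\]

By introducing a new auxiliary variable Γ and rewriting the above problem

formulation, the equivalent form is given as follows:

\[
\begin{aligned}
\min_{\Gamma,\, x_{n,a}} \quad & \Gamma \\
\text{s.t.} \quad & \Gamma \ge \sum_{n \in \mathcal{N}} \sum_{a \in \mathcal{A}_n} x_{n,a}^h, \quad \forall h \in \mathcal{H}
\end{aligned}
\tag{2.20}
\]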

Now the problem becomes a linear program and can be solved using either

the simplex method or an interior-point method (IPM).
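The equivalence between (2.19) and (2.20) rests on a standard epigraph argument, sketched below. The shorthand $f_h(x)$ for the total load at hour h is introduced here purely for illustration.

```latex
\[
f_h(x) = \sum_{n \in \mathcal{N}} \sum_{a \in \mathcal{A}_n} x_{n,a}^h,
\qquad
\{\Gamma : \Gamma \ge f_h(x)\ \ \forall h \in \mathcal{H}\}
  = \Big[\max_{h \in \mathcal{H}} f_h(x),\ \infty\Big)
\;\Longrightarrow\;
\min_{\Gamma}\ \Gamma = \max_{h \in \mathcal{H}} f_h(x).
\]
```

Minimising over the schedules as well recovers (2.19). Every $f_h$ is linear in the schedule variables, so the objective and the H added constraints are linear; the overall problem is a linear program provided the remaining scheduling constraints are linear too, as they are in (2.9).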

The above papers provide a new picture of home energy management based

customer demand modelling research in the context of smart grids. Other related

work includes [LCL11,AW14,OSKL13,ZLSS13,ZMPM13,QZHW13,LYH+14].

Customer Behaviour Learning based Demand Modelling

As discussed in the above, much related work on demand response management

takes the assumption that customers’ smart meters are installed with home energy

management systems (HEMS), i.e. optimization software that, for example, minimizes

customers’ bills or maximizes their comfort. In other words, the retailer

can know the electricity consumption patterns of such customers by interacting

with these customers via the two-way communication infrastructure. However,

many customers may not have such a HEMS installed at the moment.

Further, even if a customer has a HEMS, it is still very difficult or impossible

for a retailer to identify the right utility function to model the customer’s usage

behaviour due to various uncertainties in usage patterns. As a result, to determine

the retail prices, the retailer has to learn the customers’ energy consumption

patterns.

Current studies on learning customers’ energy consumption behaviours mainly

focus on understanding the aggregated energy usage patterns of customers, i.e.

they either learn the aggregated response of a pool of customers to the price

signals or learn the aggregated response of each individual customer to price


signals [GCBK12] [TK13] [MYEH10] [KZE02] [HNG+13]. [GCBK12] utilizes linear regression models to learn the price elasticity of demand, which gives an aggregated response of the customers to price signals in one distributed power system. [TK13] proposes an agent-based model to study the customers’ price elasticity of demand and its economic effects on electricity markets, showing that reductions in price spikes, customers’ bills and emissions of greenhouse gases and other pollutants can be achieved when customers respond to the price signals. [MYEH10] investigates the responses of different types of customers (residential, industrial and commercial) to the price signals in the smart grid environment and concludes that smart grid technologies can bring customers with any type of demand (e.g. price-taking demand and price-responsive demand) into the market as active participants. [KZE02] presents a neuro-fuzzy approach to short-term load forecasting in a price-sensitive environment, in which the customers are assumed to react to the price signals. [HNG+13] proposes a daily load curve forecasting model for residential customers based on a time series and stochastic regression framework, where the customers are assumed to respond to the price signals.
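To make the idea of learning an aggregated price elasticity by regression concrete, the following minimal sketch fits a constant-elasticity model by ordinary least squares on log-transformed data. The log-log specification, the function names and the synthetic data are assumptions for illustration; this is not necessarily the model used in [GCBK12].

```python
import math


def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x (single regressor)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def price_elasticity(prices, demands):
    """Slope of log(demand) on log(price), i.e. a constant-elasticity fit."""
    _, slope = fit_line([math.log(p) for p in prices],
                        [math.log(d) for d in demands])
    return slope


# Synthetic (not measured) data generated from demand = 10 * price ** -0.3.
prices = [0.08, 0.10, 0.12, 0.15, 0.20, 0.25]
demands = [10.0 * p ** -0.3 for p in prices]
elasticity = price_elasticity(prices, demands)
print(round(elasticity, 3))
```

Because the data are generated without noise, the fit recovers the elasticity used to generate them; with real aggregated demand data the estimate would carry sampling error.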

2.2.2 Smart Pricing Design for the Retailer

Smart pricing, which influences customers’ energy usage decisions via price signals, is a very promising option for demand response management. The next question is how to design an efficient smart pricing scheme for the retailer that takes the customers’ responses into account and benefits both the retailer and its customers.

Although there is some related work on customer behaviour learning based demand modelling, there seems to be no literature on customer behaviour learning based smart pricing design. As a result, in this subsection, we only explore the smart pricing design methods from two perspectives: single-level optimization based approaches [SMRS+10a,LCL11] and two-level game theory based approaches [BYL11,CKS11,CYG12,YTN13,QZHW13,MZZ+13,ZMPM13,CCYZ14]. Nevertheless, we will highlight our contributions in customer behaviour learning based demand modelling and the corresponding smart pricing design methods in Section 2.3.

For two-level game theory based smart pricing design approaches, the Stackelberg game, which represents the hierarchical interactions between different players, is a natural choice. Game theory is a branch of decision theory that attempts to model the interactions between decision-making agents. In short, game theory deals with any problem in which players can affect each other.

In the following, we first give the definition of the Stackelberg game and its solution method, followed by the details of the bilevel optimization model (an equivalent form of the Stackelberg game) and its solution method. Finally, the related work on smart pricing design methods is explored.

Stackelberg Game

Stackelberg games, which are also called leader-follower games, were first proposed by von Stackelberg in 1934 (English translation 1952 [VS52]).

In a two-person Stackelberg game, it is assumed that one player selects his strategy first and the other player then selects his strategy in response. The player who selects and announces his strategy first is called the leader, and the player who responds with his strategy second is called the follower. The Stackelberg problem is then to find an optimal strategy for the leader, assuming that the follower reacts rationally, i.e. maximizes his pay-off function given the leader’s actions.

In a general Stackelberg game with one leader and $N$ followers, the leader first chooses and announces its strategy $u_L$ from its strategy space $U_L$. After learning the leader’s strategy, the other $N$ players, called the followers, play a non-cooperative game with each other to reach a Nash equilibrium (NE) among themselves. Let $U_L$ and $U_{F_i}$ ($i = 1, 2, \ldots, N$) be the strategy spaces of the leader and each follower respectively.

Given the leader’s strategy $u_L$, and assuming the followers’ goals are to maximize their pay-off functions $J_{F_i}$ ($i = 1, 2, \ldots, N$), a Nash equilibrium for this $N$-player game can be obtained by solving the following problem:

$$J_{F_i}[R_i(u_L), R_{-i}(u_L); u_L] = \max_{u_{F_i}} J_{F_i}[u_{F_i}, R_{-i}(u_L); u_L], \quad i = 1, 2, \ldots, N \tag{2.21}$$

where $R_{-i}(u_L)$ represents the Nash strategies of all the other followers except follower $i$, given the leader’s strategy.

Assume the leader’s goal is to maximize its pay-off function $J_L$ and that the above $N$-player game admits a unique Nash equilibrium. The Stackelberg equilibrium for the leader can then be obtained by solving the following problem:

$$J_L[u_L^*; R_1(u_L^*), R_2(u_L^*), \ldots, R_N(u_L^*)] = \max_{u_L} J_L[u_L; R_1(u_L), R_2(u_L), \ldots, R_N(u_L)]. \tag{2.22}$$

Defining the Stackelberg equilibrium for follower $i$ as $u_{F_i}^* = R_i(u_L^*)$, the strategy vector $(u_L^*, u_{F_1}^*, \ldots, u_{F_N}^*)$ is an optimal Stackelberg strategy for the above general Stackelberg game with one leader and $N$ followers.

In the following and in the rest of this thesis, we consider a special case of the above Stackelberg game, i.e. a Stackelberg game with one leader and $N$ independent followers. Nash equilibrium concepts are not needed in this special case, as the actions of one follower do not affect the actions of the other followers. The rules of play are as follows: first, the leader chooses and announces its strategy $u_L$ from its strategy space $U_L$; after learning the leader’s strategy, the other $N$ players, called the followers, decide their best response strategies $u_{F_i}^* = R_i(u_L)$ ($i = 1, 2, \ldots, N$) from their strategy spaces $U_{F_i}$ ($i = 1, 2, \ldots, N$) respectively. Assuming that each follower’s goal is to maximize its pay-off function, the followers’ best response strategies (reaction functions) are defined as

$$u_{F_i}^* = R_i(u_L) = \arg\max_{u_{F_i}\in U_{F_i}} J_{F_i}(u_L, u_{F_i}), \quad i = 1, 2, \ldots, N \tag{2.23}$$

Taking into account the followers’ reaction functions $R_i(u_L)$ ($i = 1, 2, \ldots, N$), the leader’s goal is to find the strategy that maximizes its pay-off function $J_L(u_L, u_{F_1}^*, \ldots, u_{F_N}^*)$. That is, find

$$u_L^* = \arg\max_{u_L\in U_L} J_L[u_L, R_1(u_L), \ldots, R_N(u_L)] \tag{2.24}$$

Definition 1. In the above Stackelberg game with one leader and $N$ independent followers, an optimal Stackelberg strategy is a strategy vector $(u_L^S, u_{F_1}^S, \ldots, u_{F_N}^S)$ satisfying the following conditions:

$$u_{F_i}^S = R_i(u_L^S) = \arg\max_{u_{F_i}\in U_{F_i}} J_{F_i}(u_L^S, u_{F_i}), \quad i = 1, 2, \ldots, N \tag{2.25}$$

$$u_L^S = \arg\max_{u_L\in U_L} J_L[u_L, R_1(u_L), \ldots, R_N(u_L)] \tag{2.26}$$


The Solution to the Stackelberg Game

The leader-follower Stackelberg game model above is a two-stage, sequential decision-making problem. The common solution method for a multi-stage Stackelberg game is backward induction [FT91].

Backward induction first considers the moves that come last in the game and determines the best move for the player in each case. Then, taking these as given future actions, it proceeds backwards in time, again determining the best move for the respective player, until the beginning of the game is reached [TvS01].
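The following sketch shows backward induction for the special case with one leader and independent followers when all strategy spaces are finite grids. It is an illustration under stated assumptions only: the function names, the quadratic customer pay-off, the constant unit cost and the grids are hypothetical and are not taken from the cited works.

```python
def best_response(payoff, leader_action, follower_actions):
    """Follower's reaction R_i(u_L): argmax of its pay-off given the leader's action."""
    return max(follower_actions, key=lambda u_f: payoff(leader_action, u_f))


def solve_stackelberg(leader_payoff, follower_payoffs, leader_actions, follower_actions):
    """Backward induction over finite action grids, one leader, independent followers."""
    best = None
    for u_l in leader_actions:
        # Last stage first: each follower reacts independently to u_l.
        responses = [best_response(j_f, u_l, follower_actions) for j_f in follower_payoffs]
        value = leader_payoff(u_l, responses)
        if best is None or value > best[0]:
            best = (value, u_l, responses)
    return best


# Hypothetical example: a retailer (leader) sets a price, two customers
# (followers) choose demand d to maximise x * d - 0.5 * d**2 - price * d.
def make_customer(x):
    return lambda price, d: x * d - 0.5 * d ** 2 - price * d


customers = [make_customer(4.0), make_customer(6.0)]
unit_cost = 1.0  # assumed constant procurement cost per unit of energy


def retailer_profit(price, demands):
    return (price - unit_cost) * sum(demands)


price_grid = [i / 10 for i in range(0, 61)]   # 0.0, 0.1, ..., 6.0
demand_grid = [i / 10 for i in range(0, 61)]  # 0.0, 0.1, ..., 6.0
profit, price, demands = solve_stackelberg(
    retailer_profit, customers, price_grid, demand_grid)
print(price, demands, profit)
```

Grid search keeps the example short and makes no differentiability assumption, at the cost of scaling poorly with the number of grid points; for smooth problems the followers' reaction functions are usually derived analytically instead.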

As the Stackelberg game is equivalent to the bilevel optimization model, solution methods for the latter, such as Karush-Kuhn-Tucker (KKT) [BV04] condition based methods, can also be used to obtain the Stackelberg equilibrium, as discussed in the following.

Bilevel Optimization

Bilevel optimization first arose in the field of game theory, where it is known as the Stackelberg game. Bilevel optimization can be regarded as a Stackelberg model with complete information, i.e. the leader in such a Stackelberg game knows the utility functions and strategy spaces of the followers. The difference between the two is one of emphasis: the Stackelberg game is more commonly studied from the application perspective, i.e. modelling the interactions and game behaviours between different players, while bilevel optimization is more widely studied from the optimization perspective, i.e. designing proper algorithms to solve the problem.

The general formulation of a bi-level optimization problem with one upper level decision agent and $N$ independent lower level decision agents can be represented as follows [SMD12] [CG07]:

$$\begin{aligned}
\max_{x,\,y_1,\ldots,y_N} \quad & F(x, y_1, \ldots, y_N) \\
\text{s.t.} \quad & y_i \in \arg\min_{y_i'} \{ f_i(x, y_i') : g_i(x, y_i') \le 0 \}, \quad i = 1, \ldots, N, \\
& G(x, y_1, \ldots, y_N) \le 0, \\
& x \in X, \; y_i \in Y_i
\end{aligned} \tag{2.27}$$


Note that the argmin defines $y_i$ as a function of $x$ (when the lower-level minimizer is unique), so maximizing the upper-level objective function $F$ essentially amounts to maximizing a function of $x$. In the above formulation, $F$ represents the upper-level objective function and $f_i$ ($i = 1, 2, \ldots, N$) represent the lower-level objective functions. Similarly, $x$ is the decision vector of the upper level agent and $y_i$ is the decision vector of the $i$-th lower level agent. $G$ represents the constraint functions at the upper level and $g_i$ represents the constraint functions of the $i$-th lower level agent. $X$ denotes the bound constraints for the upper level decision vector and $Y_i$ the bound constraints for each lower level decision vector.

A solution $(x^*, y_1^*, \ldots, y_N^*)$ which maximizes the above objective function $F(x, y_1, \ldots, y_N)$ subject to all the constraints is said to be a bilevel optimal solution.

The Solution to the Bilevel Optimization Problem

Solving the bilevel optimization problem is difficult. Even the simplest version of

the bilevel optimization problem where the objective functions and the constraints

are linear has been proven to be NP-hard [CMS05] [Jer85].

To solve the bilevel optimization problem, the common approach is to reformulate it into a single-level problem by replacing the lower-level problem with its Karush-Kuhn-Tucker (KKT) conditions. Note that this approach assumes that the lower-level problem is differentiable (and, for the KKT conditions to be sufficient as well as necessary, convex in the lower-level variables).

To illustrate the Karush-Kuhn-Tucker (KKT) condition based approach simply, we consider the following bilevel model formulation with one upper level decision agent and one lower level decision agent [CMS05]:

$$\begin{aligned}
\max_{x\in X,\,y} \quad & F(x, y) \\
\text{s.t.} \quad & G(x, y) \le 0, \\
& \min_{y\in Y} \; f(x, y) \\
& \quad \text{s.t. } g(x, y) \le 0
\end{aligned} \tag{2.28}$$

where $x \in \mathbb{R}^{n_1}$ and $y \in \mathbb{R}^{n_2}$. The variables of problem (2.28) are divided into two classes, namely the upper-level variables $x$ and the lower-level variables $y$. Similarly, the functions $F : \mathbb{R}^{n_1} \times \mathbb{R}^{n_2} \to \mathbb{R}$ and $f : \mathbb{R}^{n_1} \times \mathbb{R}^{n_2} \to \mathbb{R}$ are the upper-level and lower-level objective functions respectively, while the functions $G : \mathbb{R}^{n_1} \times \mathbb{R}^{n_2} \to \mathbb{R}^{m_1}$ and $g : \mathbb{R}^{n_1} \times \mathbb{R}^{n_2} \to \mathbb{R}^{m_2}$ are called the upper-level and lower-level constraints respectively.


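The resulting single level reformulation of problem (2.28) via the KKT conditions becomes:

$$\begin{aligned}
\max_{x\in X,\,y,\,\lambda} \quad & F(x, y) \\
\text{s.t.} \quad & G(x, y) \le 0, \\
& g(x, y) \le 0, \\
& \lambda_j \ge 0, \quad j = 1, 2, \ldots, m_2, \\
& \lambda_j g_j(x, y) = 0, \quad j = 1, 2, \ldots, m_2, \\
& \nabla_y L(x, y, \lambda) = 0
\end{aligned} \tag{2.29}$$

where $\lambda_j$ are the KKT multipliers and $L(x, y, \lambda) = f(x, y) + \sum_{j=1}^{m_2} \lambda_j g_j(x, y)$ is the Lagrangian function associated with the lower level problem.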

Due to the non-convexities that occur in the complementarity and Lagrangian constraints such as $\lambda_j g_j(x, y) = 0$, it is necessary to first linearise these conditions via Fortuny-Amat linearisation [FAM81]. After linearisation, problem (2.29) can be solved by existing optimization solvers such as CPLEX, Gurobi [OPTa] or TOMLAB [OPTb].
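For reference, the big-M form usually associated with Fortuny-Amat linearisation can be sketched as follows. Each complementarity condition $\lambda_j g_j(x, y) = 0$ in (2.29) is replaced by linear constraints involving a binary variable $z_j$ and a sufficiently large constant $M$; both symbols are introduced here for illustration and are not used elsewhere in this chapter.

```latex
\begin{aligned}
0 \le \lambda_j &\le M z_j, \\
0 \le -g_j(x, y) &\le M (1 - z_j), \\
z_j &\in \{0, 1\}, \qquad j = 1, 2, \ldots, m_2 .
\end{aligned}
```

If $z_j = 0$ the multiplier is forced to zero, and if $z_j = 1$ the constraint is forced to be active, so the product $\lambda_j g_j(x, y)$ vanishes in both cases. The reformulation is only valid when $M$ is a true upper bound on $\lambda_j$ and $-g_j(x, y)$: a value that is too small cuts off feasible solutions, while a very large value weakens the relaxation used by the solver.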

In cases where the lower-level problem consists of non-differentiable, discontinuous, or non-analytical functions, the KKT condition based methods are not applicable. To overcome this, heuristic and stochastic algorithms can be used to search for a (near-)optimal solution to the bilevel optimization problem.

Single-level Optimization based Smart Pricing Design

The work of [SMRS+10a] is a typical example of single-level optimization based smart pricing design. It is concerned with how retailers set electricity prices to maximize a welfare function, where the customers’ energy consumption patterns are modelled in the form of theoretic utility functions. It focuses on interactions between the retailer and the smart meters (customers) through the exchange of control messages, which contain customers’ energy consumption and real-time price information. The energy cost imposed on the retailer is further modelled as a cost function: the cost of generating or distributing electricity by the retailer at each hour $h$ is defined as $C_h(L_h)$, which is assumed to be increasing and convex [MRWJ+10a]. As in this thesis we only consider the electricity retail market, which consists of the retailer and the customers, it is neither necessary nor easy to explicitly


take into account renewable energy when modelling the retailer’s energy cost. Nevertheless, the widely used energy cost models for the retailer, such as the quadratic cost function [MRWJS10] [LCL11] or the piecewise linear cost function [Hyd15], can reflect the impact of integrating renewable energy at the supplier-side in the wholesale market. For example, if the energy demand exceeds the generation capacity of the conventional energy resources, the supplier in the wholesale market has to use more expensive renewable energy resources to balance supply and demand. As a result, the corresponding energy cost of the supplier and the retailer will increase dramatically as the energy demand increases.

Finally, the welfare maximization problem is formulated as the utility function minus the cost function:

$$\begin{aligned}
\max \quad & \sum_{h\in H} \left( \sum_{n\in N} U(x_n^h, \omega_n^h) - C_h(L_h) \right) \\
\text{s.t.} \quad & \sum_{n\in N} x_n^h \le L_h, \quad \forall h \in H
\end{aligned} \tag{2.30}$$

where $U(x_n^h, \omega_n^h)$ represents the utility function of customer $n$ in time slot $h$, $x_n^h$ is the amount of energy consumed by customer $n$ at hour $h$ and $\omega_n^h$ denotes the $\omega$ parameter of the utility function of customer $n$ in time slot $h$. The above optimization problem can be solved using convex programming techniques in a centralised manner, but doing so requires knowing the exact utility functions of the customers. Since the utility parameter $\omega_n^h$ of each customer $n$ is assumed to be private, the retailer might not have sufficient information to solve the above problem. To overcome this, the authors propose distributed algorithms which are executed at the customer-side and retailer-side respectively to solve Eq. (2.30). Similar models and solution methods are proposed in [LCL11].
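One common way to solve a problem of the form (2.30) without revealing the private parameters is dual decomposition, in which only prices and loads are exchanged. The sketch below illustrates this for a single time slot under assumptions that are not taken from the source: a quadratic utility $\omega x - \frac{\alpha}{2} x^2$, a quadratic cost $a L^2$, a fixed step size, and hypothetical function names. It should be read as an illustration of the idea rather than as the algorithm of [SMRS+10a].

```python
def customer_response(omega, alpha, price, x_max):
    """Customer step: maximise omega*x - 0.5*alpha*x**2 - price*x over [0, x_max]."""
    return min(max((omega - price) / alpha, 0.0), x_max)


def retailer_supply(price, a):
    """Retailer step: maximise price*L - a*L**2 over L >= 0."""
    return max(price / (2.0 * a), 0.0)


def distributed_pricing(omegas, alpha=1.0, a=0.5, x_max=10.0, step=0.1, iterations=200):
    """Subgradient price update for one time slot; only prices and loads are exchanged."""
    price = 0.0
    for _ in range(iterations):
        demands = [customer_response(w, alpha, price, x_max) for w in omegas]
        supply = retailer_supply(price, a)
        # Raise the price when demand exceeds supply, lower it otherwise.
        price = max(0.0, price + step * (sum(demands) - supply))
    demands = [customer_response(w, alpha, price, x_max) for w in omegas]
    return price, demands, retailer_supply(price, a)


price, demands, supply = distributed_pricing([4.0, 6.0])
print(round(price, 4), [round(d, 4) for d in demands], round(supply, 4))
```

Each customer uses only its own $\omega$ and the announced price, which is why the private parameters never need to be sent to the retailer; convergence of the price update depends on the step size being small enough.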

Two-level Game Theory based Smart Pricing Design

Two-level game theory based smart pricing design deals with how retailers determine electricity prices based on the expected responses of customers, where the interactions between the retailer and its customers are represented as a Stackelberg game [BYL11,CKS11,CYG12,YTN13,QZHW13,MZZ+13,CCYZ14] or a bilevel optimization model [ZMPM13,CAC09].

From the game-theory perspective, since the Stackelberg game is a multi-stage


game, backward induction [FT91] can be used to solve for the equilibrium. It is

common in the Stackelberg game based smart pricing design that the energy

retailer determines and announces the electricity price first, and then customers

adjust the amount of electricity they use accordingly in response to the price

signals. According to the backward induction principle, the first step is to find

a customer’s optimal demand response to the price set by the retailer, and the

second step is to plug the optimal demand response which is a function of the price

into the objective function of the retailer and then optimize the retailer’s objective

function to find the optimal pricing based on customers’ response [YTN13].

To avoid the difficulties in solving practical game theory models, some works

[BYL11, CYG12, MZZ+13, CCYZ14, YTN13] model the customer-side problem

using theoretic household utility functions. For example, [BYL11] proposes a four-

stage Stackelberg game to model the interactions among generators, the retailer

and the customers where the customer-side problem is modelled using theoretic

household utility functions. Similarly, [CYG12] models the load uncertainty in the

optimal demand response scheduling scheme for the smart grid where theoretic

utility functions are adopted to model customers’ preferences.

More specifically, in [BYL11], the customer-side problem is modelled to maximize each customer’s utility while the retailer-side problem is modelled to maximize the retailer’s profit. The customer-side problem is modelled via a utility function, where the utility of an arbitrary user $i$ is defined as $U_i(p, d_i) = X_i d_i - \frac{\alpha_i}{2} d_i^2 - p d_i$, where $X_i$ is a parameter that may vary among customers, $d_i$ denotes the electricity consumption level of customer $i$, $\alpha_i$ is a pre-determined parameter, and $p$ is the price provided by the retailer. For the retailer-side problem, the electricity sources from which the retailer buys energy are divided into two types, cheaper but uncertain (option I) and expensive but certain (option II), and the energy cost imposed on the retailer is modelled using a linear cost function. The details of the proposed four-stage Stackelberg game are as follows:

• Stage I: The electricity retailer, as the Stackelberg leader, first decides the amount of electricity procured from electricity source option I.

• Stage II: The retailer decides the amount of electricity procured from electricity source option II, based on the electricity level procured in stage I.

• Stage III: The retailer decides the real-time price to offer to the customers based on the total electricity supply.

• Stage IV: The customers, who are the followers in the Stackelberg game, adjust their individual electricity demand to maximize their individual utility.
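Stage IV can be worked out in closed form from the quadratic utility function of [BYL11]. The derivation below is a sketch that assumes $\alpha_i > 0$ and ignores any upper bound on demand; it is not quoted from [BYL11].

```latex
\begin{aligned}
U_i(p, d_i) &= X_i d_i - \frac{\alpha_i}{2} d_i^2 - p\, d_i, \\
\frac{\partial U_i}{\partial d_i} &= X_i - \alpha_i d_i - p = 0
\;\;\Longrightarrow\;\;
d_i^*(p) = \frac{X_i - p}{\alpha_i}
\quad (\text{taken as } 0 \text{ if } p > X_i).
\end{aligned}
```

Since $\partial^2 U_i / \partial d_i^2 = -\alpha_i < 0$, the stationary point is a maximum. Substituting $d_i^*(p)$ into the retailer's Stage III problem is exactly the backward induction step: the last stage is solved first and its solution becomes a known function of the earlier decision.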

Some other works, such as [CKS11, ZMPM13], instead of using theoretic utility functions, model one type of home appliance in detail for the customer-side problem and represent the interactions between the retailer and its customers as a two-level Stackelberg game or a bilevel optimization problem.

For example, [CKS11] models the interactions between the retailer and its customers as a 1-leader, $N$-follower Stackelberg game. Firstly, they give the detailed customer-side model for one type of appliance (see Eq. (2.12) for details). Secondly, they model the retailer-side problem as a profit maximization problem. The objective function of the retailer is defined as the gross profit, $GP$, which equals the revenue, i.e. $\sum_{t=1}^{T} \pi_t p^t_{n,a}$, minus the energy cost to the provider, where $\pi_t$ is the retail price, $p^t_{n,a}$ is the energy consumption at time slot $t$ of appliance $a$ for customer $n$ and $T$ is the appliance scheduling horizon. The energy cost to the provider has two parts: one cost, $C_e = \sum_{t=1}^{T} ø_t p^t_{n,a}$, comes from purchasing energy for this appliance from the wholesale market, where $ø_t$ is the wholesale price at time $t$, and the other cost, $C_m$, is due to the “mismatch” between the actual load and the planned supply caused by this appliance. As a result, the profit maximization problem for the retailer is modelled as follows:

$$\max_{\pi_t} \; \sum_{t=1}^{T} (\pi_t - ø_t)\, p^t_{n,a} - C_m \tag{2.31}$$

Finally, the above 1-leader, $N$-follower Stackelberg game is solved by distributed algorithms executed at the retailer-side and the customer-side respectively.

[QZHW13] proposes an optimal demand response scheme via price control, in which three types of home appliances are modelled in the customer-side problem and a two-level game model is used to represent the interactions between the retailer and its customers.

More specifically, they categorize the home appliances into background (inelastic) appliances denoted as $A_u$ (e.g. lighting), elastic appliances denoted as $B_u$ (e.g. air-conditioning) and semi-inelastic appliances denoted as $C_u$ (e.g. washing machine and dishwasher). For elastic appliances $B_u$, the customers usually have higher satisfaction for more energy consumed per unit time (up to a maximum consumption bound). As a result, the customer-side problem (P1), which is given below, is to maximize a pay-off that trades off maximizing the customer’s satisfaction, i.e. $S = \sum_{h\in H}\sum_{a_u\in B_u} U_{a_u,h}(e_{a_u,h})$, against minimizing the customer’s bill payment, i.e. $P = \sum_{h\in H} p_h \left( \sum_{a_u\in A_u \cup B_u \cup C_u} e_{a_u,h} \right)$, where $a_u$ represents an appliance of user $u$, $H$ is the scheduling window, $p_h$ is the price at time $h$ provided by the retailer, and $e_{a_u,h}$ represents the energy consumption of appliance $a_u$ at time slot $h$.

$$\text{P1:} \quad \max_{e_{a_u,h}} \; (S - P) \tag{2.32}$$

When solving Problem P1, the optimal total energy consumption from all users in each time slot depends on the price vector $p = [p_1, \ldots, p_H]$. Let $S_{u,h}(p)$ denote the corresponding optimal total energy consumption of user $u$ at time slot $h$. As a result, the revenue of the retailer is $\sum_h \left( \sum_u S_{u,h}(p) \right) p_h$, and the profit maximization problem of the retailer is obtained as the difference between its revenue and its energy procurement cost:

$$\text{P2:} \quad \max_{p} \; \sum_h \left( \sum_u S_{u,h}(p) \right) p_h - \sum_h C_h\!\left( \sum_u S_{u,h}(p) \right) \tag{2.33}$$

where $C_h\!\left( \sum_u S_{u,h}(p) \right)$ is the hourly energy procurement cost from the wholesale market imposed on the retailer [Kot11].

Finally, the proposed two-level game model (Problems P1 and P2) is solved via distributed algorithms based on simulated annealing [Rut89].
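To indicate how simulated annealing can search over a price vector in a problem shaped like P2, a minimal sketch follows. It is centralised rather than distributed, and the linear demand response standing in for $S_{u,h}(p)$, the quadratic procurement cost, the cooling schedule and all parameter values are assumptions for illustration; none of them is taken from [QZHW13].

```python
import math
import random


def customer_demand(price, base=5.0, slope=1.0):
    """Hypothetical stand-in for S_{u,h}(p): linear, non-negative response to price."""
    return max(base - slope * price, 0.0)


def retailer_profit(prices, n_customers=3, cost_coeff=0.1):
    """Revenue minus an assumed quadratic procurement cost, summed over time slots."""
    total = 0.0
    for p in prices:
        load = n_customers * customer_demand(p)
        total += load * p - cost_coeff * load ** 2
    return total


def simulated_annealing(prices, lo=0.0, hi=5.0, temp=1.0, cooling=0.995,
                        steps=2000, seed=0):
    """Maximise retailer_profit by perturbing one slot's price at a time."""
    rng = random.Random(seed)
    current, current_val = list(prices), retailer_profit(prices)
    best, best_val = list(current), current_val
    for _ in range(steps):
        candidate = list(current)
        h = rng.randrange(len(candidate))
        candidate[h] = min(hi, max(lo, candidate[h] + rng.uniform(-0.5, 0.5)))
        cand_val = retailer_profit(candidate)
        # Always accept improvements; accept worse moves with Boltzmann probability.
        if cand_val >= current_val or rng.random() < math.exp((cand_val - current_val) / temp):
            current, current_val = candidate, cand_val
            if current_val > best_val:
                best, best_val = list(current), current_val
        temp *= cooling
    return best, best_val


start = [1.0, 1.0, 1.0]
best_prices, best_profit = simulated_annealing(start)
print([round(p, 2) for p in best_prices], round(best_profit, 2))
```

The search only needs to evaluate the profit for a candidate price vector, so the customer response can be any black-box routine, including one that is non-differentiable; this is the property that makes such heuristics attractive when KKT based reformulations are not applicable.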

From the optimization perspective, finding the best dynamic pricing scheme for the retailer often requires solving a bilevel optimization problem which represents the interactions between an energy retailer and its customers. In current bilevel optimization theory, the common approach is to cast the bilevel optimization problem into an equivalent single-level optimization problem by replacing the lower level problems with their Karush-Kuhn-Tucker (KKT) optimality conditions and then to solve the resulting single-level problem. Such a KKT condition based approach has been used in existing research such as [ZMPM13] and [CAC09] to solve the pricing optimization problems for retailers. In [ZMPM13], a bilevel approach to model the interactions between the retailer and its customers in a demand response environment is proposed. The paper demonstrates that the proposed bilevel model can be reformulated as a single-level Mixed Integer Linear Programming (MILP) problem by replacing the lower level problem with its KKT conditions. [CAC09] proposes a bilevel optimization approach to solve the decision-making problems faced by the retailer. Similarly, the non-linear bilevel optimization problem is cast into a single-level MILP problem via the KKT condition based approach.

Finally, Table 2.2 shows a classification of the related smart pricing design

methods for the retailer.


Table 2.2: Summary Table of Smart Pricing based Demand Response

| Paper | System Model | Customer-side Model | Solution Algorithms |
|---|---|---|---|
| [SMRS+10a] | Single-level Optimization | Theoretic Utility Function | Distributed Algorithms |
| [LCL11] | Single-level Optimization | Detailed Appliance Model | Distributed Algorithms |
| [BYL11, CYG12, YTN13] | Stackelberg Game | Theoretic Utility Function | Centralised Solution |
| [MZZ+13, CCYZ14] | Stackelberg Game | Theoretic Utility Function | Distributed Algorithms |
| [CKS11, QZHW13] | Stackelberg Game | Detailed Appliance Model | Distributed Algorithms |
| [ZMPM13, CAC09] | Bilevel Optimization | Detailed Appliance Model | KKT based Approach |


2.3 Critical Analysis

In this section, we discuss the limitations of the existing literature and highlight our research motivations. The critical analysis covers five aspects: utility function and home energy management based demand modelling, customer behaviour learning based demand modelling, optimization and game theory based smart pricing design, smart pricing computation, and customer behaviour learning based smart pricing design.

2.3.1 Utility Function and Home Energy Management Based Demand Modelling

Although the above-mentioned work, and further work not discussed in detail, on utility function and home energy management based demand modelling [MRLG10b,CKS11,MRLG10a,MRWJ+10a,AW14,LYH+14] has provided valuable methods and results, there are still notable gaps or weaknesses in the existing literature:

Firstly, the existing research has failed to model certain important types of appliances commonly used in most households. For example, none of the works in [MRLG10a] and [MRWJ+10a, AW14, LYH+14] addresses the usage optimization modelling problem for curtailable appliances (such as air conditioning), which needs to be solved in order to help customers use such appliances as cheaply and as beneficially as possible.

Secondly, for interruptible appliances and non-interruptible appliances, the existing waiting time cost or benefit model [MRLG10b] [CKS11] is a purely theoretic one, which most ordinary customers would be unable to set up and use, and it is therefore inapplicable in practice.

Given these shortcomings, the first motivation of our research is to propose a home appliance based energy management system which models the most commonly used types of home appliances, including curtailable appliances, and most possible types of applications, in order to find the best usage and scheduling scheme for customers, and to develop an applicable and implementable waiting time cost model which is usable by ordinary customers. The details of our proposed home energy management system, which overcomes the shortcomings of existing research, can be found in Section 3.3 of Chapter 3 and Section 4.4 of Chapter 4.


2.3.2 Customer Behaviour Learning Based Demand Modelling

Although the papers [GCBK12] [TK13] [MYEH10] [KZE02] [HNG+13] discussed in the previous “Customer Behaviour Learning based Demand Modelling” part present valuable findings, our proposed approach differs from these existing approaches in the following aspects. Our work categorizes the home appliances into shiftable appliances and curtailable appliances according to their load types [MZ13] and proposes different learning models for different types of appliances, which is missing in the above literature. The importance of such appliance-level behaviour learning models lies in the fact that different types of appliances have different load patterns, and thus different learning models are needed to accurately model customers’ behaviours. Furthermore, our proposed models are built for each individual customer rather than for aggregated customers. That is, for each load type, differences in behaviour and usage patterns between customers in responding to the price and temperature signals can be identified. In other words, our proposed appliance-level and individual-level behaviour learning models are more fine-grained than those in the existing literature.

Based on the above analysis, the second motivation of our research is to design

accurate and comprehensive individual customer behaviour learning models down

to the appliance-level for the retailer with the purpose of retail price determination

and customer behaviour analysis. The details of our proposed customer behaviour

learning model which fills the gap of the existing research can be found in Section

5.4 of Chapter 5.

2.3.3 Optimization and Game-theory based Smart Pricing Design

In terms of single-level optimization and two-level game-theory based smart pricing design, the pricing optimization problems and models in the existing research [SMRS+10a,BYL11,CKS11,CYG12,MZZ+13,ZMPM13,CCYZ14,QZHW13] are either oversimplified or unrealistic from an application point of view. For example, in single-level optimization based smart pricing design, [SMRS+10a] models customers’ behaviours with theoretic household utility functions rather than with models of real home appliances. As a result, the resulting smart pricing design model is not accurate enough, as it does not take into account customers’ real responses.

Although [BYL11, CKS11, CYG12, MZZ+13, ZMPM13, CCYZ14] adopt the Stackelberg game as our work does, they either model only one type of appliance in the customer-side problem or fail to give a realistic and explicit form of the customers’ utility functions. In other words, such problem formulations and models capture a retailer’s pricing optimization problem only partially or unrealistically and are therefore insufficient. Further, the difference between [QZHW13] and our work is that our proposed customer-side problem, which considers the most commonly used types of appliances and possible applications as well as an effective customer waiting cost model, is more comprehensive and accurate. As a result, the retail pricing problem we consider for demand response management is more practical.

For the above reasons, the third motivation of our research is to develop

a complete and comprehensive pricing optimization model that accurately and

realistically represents the real pricing problem faced by retailers to enable the

usability and applicability of the resulting models. The details of our proposed

smart pricing design model can be found in Section 3.3 of Chapter 3 and Section

4.4 of Chapter 4.

2.3.4 Smart Pricing Computation

From the optimization perspective, finding the best dynamic pricing scheme for the retailer often requires solving a bilevel optimization problem, which represents the interactions between an energy retailer and its customers. The common solution to this bilevel pricing game is the KKT condition based approach [ZMPM13] [CAC09]. However, such an approach is impractical in real applications because a retailer has thousands to millions of customers, each of which may have several constraints in its lower-level optimization problem. The KKT condition based approach therefore produces a single-level problem with far too many constraints to be solved by existing optimization software.

Another important issue worth mentioning is the customers’ privacy concerns. By replacing the lower level problem with its KKT conditions, the lower-level problem will be exposed to the retailer and may cause privacy problems to customers. For instance, the retailer may know the on/off time of appliances in the customers’ houses and even the customers’ daily life patterns.

For the above reasons, the fourth motivation of our research is to develop a hybrid approach that solves the retailer-side problem and the customer-side problems in a distributed way, overcoming the impracticality of the existing KKT-condition-based approach. Regarding the preservation of customers' privacy under our proposed distributed hybrid optimization approach, we assume that there is a third-party regulator between the retailer and its customers. Each customer's energy consumption data is first uploaded to the third party and processed there. The third party passes only the aggregated consumption data of a whole region, or an abstract model, to the retailer for pricing optimization. That is, the retailer does not have direct access to any individual customer's data. The details of our proposed distributed optimization algorithms can be found in Section 4.5 of Chapter 4.

2.3.5 Customer Behaviour Learning based Smart Pricing Design

Appliance-level demand modelling based on customer behaviour learning is missing from the existing literature, let alone smart pricing design based on customer behaviour learning. As a result, the fifth motivation of our research is to design an efficient distributed pricing optimization model based on the customer behaviour learning results for demand response management. The details of our proposed customer behaviour learning based smart pricing algorithms can be found in Section 5.5 of Chapter 5.

2.4 Chapter Summary

This chapter has given a detailed background and a thorough review of the existing literature, which are necessary for understanding the rest of this thesis. In particular, we have looked at smart grid related concepts and techniques and the background of demand response programs, followed by a detailed review of related work on demand response, i.e. customer demand modelling and smart pricing design. Finally, a critical analysis of the existing literature has been given.

Chapter 3

Smart Pricing to Demand Response Management I – A Stackelberg Game Based Approach

3.1 Introduction

In this chapter, we propose a Stackelberg game based smart pricing approach to demand response management for the smart grid. A Stackelberg game is a natural model for the interactions between hierarchical decision-making agents, i.e. the energy retailer and its customers in our problem. This chapter, which is adapted from our published work [MZ12] [MZ13], can be seen as our first attempt to solve the smart pricing problem for demand response management.

The rest of this chapter is organized as follows. Firstly, the problem statement is given in Section 3.2. Secondly, the formulation of our proposed Stackelberg game model is given in Section 3.3. More specifically, the energy usage scheduling problem for customers (follower-side problem) is formulated in subsection 3.3.1 and the profit maximization model for the retailer (leader-side problem) is formulated in subsection 3.3.2. Thirdly, the solution methods for the Stackelberg game model are given in Section 3.4, which includes the proof of existence of a Stackelberg equilibrium (subsection 3.4.1) and the KKT-condition-based solution method (subsection 3.4.2). Fourthly, numerical results are given in Section 3.5 to show the benefits of our proposed model to the retailer and its customers.

Figure 3.1: Structure of a Residential Power Network

This chapter is concluded in Section 3.6.

3.2 Problem Statement

We consider a residential power network, shown in Figure 3.1, which consists of one retailer and N customers [CKS11]. It is assumed that each customer is equipped with a smart meter. The retailer procures electricity from the wholesale market, determines retail prices and sends the price information to the customers' smart meters via a local area network (LAN). The smart meters then manage the electricity usage of home appliances in response to the price signal and transmit the electricity demand information to the retailer. These interactions between the retailer and its customers are enabled by the two-way communication infrastructure. As can be seen from the above interaction process, this is a 1-leader, N-follower Stackelberg game: the customers make their best energy consumption decisions based on the prices announced by the retailer, and in turn the retailer designs the best day-ahead prices by taking the customers' responses into account.

The Stackelberg game considered in this chapter can be formulated as below:

There are $N + 1$ players, in which one player (i.e. the retailer) is the leader $L$ and the $N$ other players (i.e. the customers) are the followers $F_n$ $(n = 1, 2, \ldots, N)$. The leader's strategy is denoted as $u_L$ and its strategy space is $U_L$, whereas the strategy of follower $F_n$ is $u_{F_n}$ and its strategy space is $U_{F_n}$. The pay-off function for leader $L$ is $J_L(u_L, u_{F_1}, \ldots, u_{F_N})$, whereas the pay-off function for follower $F_n$ is $J_{F_n}(u_L, u_{F_n})$, which means that the follower's pay-off is completely decided by the leader's strategy $u_L$ and its own strategy $u_{F_n}$, and is independent of the other followers' strategies $u_{F_m}$ $(m = 1, 2, \ldots, N,\ m \neq n)$. The reason for this particular form of the followers' pay-off functions is that each customer's decision about electricity usage is independent of other customers' decisions.

In this game, each player's goal is to maximize its pay-off function, where each follower's pay-off function is defined as minus one times its bill function, whereas the leader's pay-off function is its profit function. The rule of play is as follows: the leader (i.e. the retailer) announces its strategy (i.e. the prices) first; after learning the leader's strategy, the followers (i.e. the customers) select their best reactions to minimize their electricity bills.

The above Stackelberg game is a special case of general Stackelberg games with one leader and multiple followers, because the followers' pay-off functions take the simpler form $J_{F_n}(u_L, u_{F_n})$ rather than $J_{F_n}(u_L, u_{F_1}, \ldots, u_{F_n}, \ldots, u_{F_N})$. As a result, there is no need to use the Nash equilibrium concept to define how the followers react to the leader's announced strategy; each follower simply selects the strategy that maximizes its own pay-off function. That is, for any announced strategy $u_L \in U_L$, the reaction of follower $F_n$ is to select its best strategy $u^*_{F_n}$, which maximizes its pay-off function $J_{F_n}(u_L, u_{F_n})$ for the given $u_L \in U_L$. Since every leader strategy induces such a reaction from follower $F_n$, the follower's reaction function can be defined as below:

\[ u^*_{F_n} = R_{F_n}(u_L) = \arg\max_{u_{F_n} \in U_{F_n}} J_{F_n}(u_L, u_{F_n}), \quad n = 1, 2, \ldots, N \tag{3.1} \]

With the reaction functions RFn(uL) (n = 1, 2, ..., N) defined as above, we can

define Stackelberg strategy (equilibrium) concept below by following the standard

definition.

Definition 2. Let $u^S_L \in U_L$ be a strategy which maximizes the leader's pay-off function as follows:

\[ u^S_L = \arg\max_{u_L \in U_L} J_L[u_L, R_{F_1}(u_L), \ldots, R_{F_N}(u_L)] \tag{3.2} \]

and $u^S_{F_n} = R_{F_n}(u^S_L)$ $(n = 1, 2, \ldots, N)$. Then $(u^S_L, u^S_{F_1}, \ldots, u^S_{F_N})$ is called a Stackelberg strategy.
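As an illustration of Definition 2, the following minimal Python sketch computes a Stackelberg strategy by enumeration for a toy game with finite strategy sets. It is not the solution method of this chapter: the pay-off functions, the strategy grids and the names `reaction` and `stackelberg_strategy` are hypothetical, chosen only so that each follower's best reaction is unique.

```python
def reaction(J_F, u_L, U_F):
    """Reaction function (3.1): the follower's best strategy for a given u_L."""
    return max(U_F, key=lambda u_F: J_F(u_L, u_F))


def stackelberg_strategy(J_L, J_Fs, U_L, U_Fs):
    """Enumerate the leader's strategies and keep the one maximizing (3.2)."""
    best = None
    for u_L in U_L:
        reactions = tuple(reaction(J_F, u_L, U_F) for J_F, U_F in zip(J_Fs, U_Fs))
        value = J_L(u_L, reactions)
        if best is None or value > best[0]:
            best = (value, u_L, reactions)
    return best[1], best[2]


# Hypothetical toy data: the leader picks one price level, each of the two
# followers picks one demand level.
U_L = [1, 2, 3, 4]
U_Fs = [[0, 1, 2, 3], [0, 1, 2, 3]]
# Assumed follower pay-offs (NOT the bill functions of this chapter).
J_Fs = [lambda p, x, v=v: (v - p) * x - 0.5 * x * x for v in (4, 6)]
# Assumed leader pay-off: margin over a unit cost of 1, times total demand.
J_L = lambda p, xs: (p - 1) * sum(xs)

u_L_S, u_F_S = stackelberg_strategy(J_L, J_Fs, U_L, U_Fs)
print(u_L_S, u_F_S)
```

Because each follower's pay-off depends only on the leader's strategy and its own, the reactions are computed one follower at a time, with no equilibrium computation among the followers.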

3.3 Stackelberg Game Model Formulation

Firstly, our focus is to formulate the energy management problem in response to

the day-ahead prices in each household at the follower level. Secondly, we model

the profit maximization problem for the retailer who will offer the day-ahead

prices of next 24 hours to customers at the leader level.

Throughout this chapter, let $\mathcal{N} = \{1, 2, \ldots, N\}$ denote the considered set of customers, where $N \triangleq |\mathcal{N}|$, and let $\mathcal{H} \triangleq \{1, 2, \ldots, H\}$ denote the scheduling window. Usually, $H = 24$.

We define the prices offered by the retailer as a price vector $P = [p^1, p^2, \ldots, p^h, \ldots, p^H]$, where $p^h$ represents the electricity price at hour $h$.

3.3.1 Energy Usage Scheduling for Customers – Follower Level

We divide the appliances into three categories: non-shiftable appliances, shiftable appliances and curtailable appliances. Lights and refrigerator-freezers are examples of non-shiftable appliances, whose operation cannot be shifted in time. Shiftable appliances include dish washers, washing machines and plug-in hybrid electric vehicles (PHEVs). Customers can shift the usage of this type of appliance from higher-price periods to lower-price periods. Air conditioners and space heaters are two examples of curtailable appliances. The operation of this type of appliance is also not time shiftable, but its consumption level can be adjusted if a customer feels that the prices during that period are too high.

For each customer $n \in \mathcal{N}$, we define the set of appliances in the household as $\mathcal{A}_n$. Furthermore, we define $\mathcal{S}_n$ as the set of shiftable and non-shiftable appliances and $\mathcal{C}_n$ as the set of curtailable appliances. Thus, we have $\mathcal{A}_n = \mathcal{S}_n \cup \mathcal{C}_n$ and $\mathcal{S}_n \cap \mathcal{C}_n = \emptyset$. Because the two appliance sets do not overlap, we can decompose the bill minimization problem into the two sub-problems below.

Shiftable and Non-shiftable Appliances

In this subsection, we integrate the non-shiftable appliances into the shiftable

appliances model and treat them together as one model.

This model improves on that of [MRLG10b]. In their work, an upper limit on hourly electricity usage is set for each household, whereas we impose no such constraint in the customer-side optimization problem, as there is no such usage limit in practice. Instead, we consider an upper limit on the total hourly usage of all the customers served by the same retailer and place this constraint in the retailer-side optimization problem. It represents the maximum load capacity of the power network. Therefore, we can actually control the hourly electricity usage of each household by properly determining the retail price, which is more practical from an application point of view.

For each appliance $s \in \mathcal{S}_n$, we define an electricity consumption scheduling vector:

\[ x_{n,s} = [x^1_{n,s}, \ldots, x^h_{n,s}, \ldots, x^H_{n,s}] \tag{3.3} \]

where $H$ is the scheduling horizon. For each hour $h \in \mathcal{H} \triangleq \{1, 2, \ldots, H\}$, $x^h_{n,s} \geq 0$ represents the $n$-th user's electricity consumption of appliance $s$ at time $h$.

Firstly, the user needs to set a scheduling window $\mathcal{H}_{n,s} \triangleq \{\alpha_{n,s}, \ldots, \beta_{n,s}\}$ by specifying the beginning time $\alpha_{n,s}$ and the end time $\beta_{n,s}$ of the window for shiftable appliance $s$. Within the scheduling window, the energy usage of the appliance can be shifted from high-price periods to low-price periods, and therefore the operation of the appliance may be interrupted. However, the total electricity consumption needed to finish all the operations of appliance $s$ is fixed and is denoted as $E_{n,s}$. As a result, we have:

\[ \sum_{h=\alpha_{n,s}}^{\beta_{n,s}} x^h_{n,s} = E_{n,s} \tag{3.4} \]

After defining the minimum power level $\gamma^{min}_{n,s}$ and the maximum power level $\gamma^{max}_{n,s}$ for each appliance $s \in \mathcal{S}_n$, we have

\[ \gamma^{min}_{n,s} \leq x^h_{n,s} \leq \gamma^{max}_{n,s}, \quad \forall h \in \mathcal{H}_{n,s}. \tag{3.5} \]

Note that the minimum power level $\gamma^{min}_{n,s}$ is also known as the standby power level, which refers to the electric power consumed by the appliance while it is switched off or in standby mode. In reality, $\gamma^{min}_{n,s}$ is not always zero but can be a very small amount of energy.

Then, the payment bill optimization problem for shiftable appliances can be modelled as follows:

\[
\begin{aligned}
\min J_{S_n} = \min_{x^h_{n,s}} \ & \sum_{s \in \mathcal{S}_n} \sum_{h=\alpha_{n,s}}^{\beta_{n,s}} p^h \times x^h_{n,s} \\
\text{s.t.} \ & \sum_{h=\alpha_{n,s}}^{\beta_{n,s}} x^h_{n,s} = E_{n,s}, \\
& \gamma^{min}_{n,s} \leq x^h_{n,s} \leq \gamma^{max}_{n,s}, \quad \forall h \in \mathcal{H}_{n,s}.
\end{aligned}
\tag{3.6}
\]
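For fixed prices, problem (3.6) separates by appliance, and each single-appliance problem is a small linear program. The sketch below is a hedged illustration rather than the solver used in this thesis: it solves one such problem greedily by starting every hour of the window at the minimum power level and assigning the remaining energy to the cheapest hours first, up to the maximum power level. The function name `schedule_shiftable` and the price values are assumptions made for the example.

```python
def schedule_shiftable(prices, energy, g_min, g_max):
    """Greedy solution of one appliance's problem in (3.6).

    prices : prices p^h for the hours of the scheduling window
    energy : total energy E_{n,s} required by constraint (3.4)
    g_min, g_max : power-level bounds of constraint (3.5)
    """
    n = len(prices)
    if not (n * g_min - 1e-9 <= energy <= n * g_max + 1e-9):
        raise ValueError("energy requirement incompatible with the bounds")
    x = [g_min] * n                      # satisfy the lower bounds first
    remaining = energy - n * g_min       # energy still to be placed
    for h in sorted(range(n), key=lambda h: prices[h]):   # cheapest hour first
        add = min(g_max - g_min, remaining)
        x[h] += add
        remaining -= add
        if remaining <= 1e-12:
            break
    return x


# Illustrative numbers only: a four-hour window with assumed prices.
prices = [5.0, 2.0, 4.0, 3.0]
x = schedule_shiftable(prices, energy=1.8, g_min=0.1, g_max=1.0)
bill = sum(p * xh for p, xh in zip(prices, x))
print(x, bill)
```

The greedy rule is sufficient here only because the objective is linear and the constraints are a single sum plus box bounds; it would not carry over to formulations with additional coupling constraints.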

Curtailable Appliances

As for the shiftable and non-shiftable appliances, we define an electricity consumption scheduling vector for each curtailable appliance $c \in \mathcal{C}_n$ as follows:

\[ x_{n,c} = [x^1_{n,c}, \ldots, x^h_{n,c}, \ldots, x^H_{n,c}]. \tag{3.7} \]

Similarly to shiftable and non-shiftable appliances, we assume that for each appliance $c$, the customer needs to select the beginning hour $\alpha_{n,c} \in \mathcal{H}_{n,c}$ and the end hour $\beta_{n,c} \in \mathcal{H}_{n,c}$ of a valid scheduling interval. Compared with non-shiftable and shiftable appliances, the time interval of a curtailable appliance should be stricter and more accurate because the appliance will be 'on' for the whole interval. For example, on a typical summer weekday, it is not reasonable to select the beginning hour $\alpha_{n,c}$ = 7 PM and the end hour $\beta_{n,c}$ = 7 AM (the next day) for the air conditioning, although this would be a perfectly good interval for a PHEV or a washing machine. However, $\alpha_{n,c}$ = 7 PM and $\beta_{n,c}$ = 12 AM could be a reasonable setting for air conditioning.

We define the minimum power consumption level $\gamma^{min}_{n,c}$ and the maximum power consumption level $\gamma^{max}_{n,c}$ for each appliance $c \in \mathcal{C}_n$, and we have:

\[ \gamma^{min}_{n,c} \leq x^h_{n,c} \leq \gamma^{max}_{n,c}, \quad \forall h \in \mathcal{H}_{n,c}, \tag{3.8} \]

where $\mathcal{H}_{n,c} \triangleq \{\alpha_{n,c}, \ldots, \beta_{n,c}\}$.

For each hour $h \in \mathcal{H}_{n,c}$, we model the energy consumption of curtailable appliance $c$ at hour $h$ as a linear function of the electricity price at that hour. That is, we define the linear function as $f_{n,c}(p^h) = a_{n,c} \times p^h + b_{n,c}$. For each hour $h \in \mathcal{H}$, we define the minimum price that the retailer can offer as $p^{min}$ and the maximum price that the retailer can offer as $p^{max}$. Further, we assume that user $n \in \mathcal{N}$ will consume power $\gamma^{max}_{n,c}$ at price $p^{min}$ and $\gamma^{min}_{n,c}$ at price $p^{max}$.

As we know two points on the linear function, i.e. $(p^{min}, \gamma^{max}_{n,c})$ and $(p^{max}, \gamma^{min}_{n,c})$, we can obtain $a_{n,c}$ and $b_{n,c}$ as follows:

\[ a_{n,c} = \frac{\gamma^{max}_{n,c} - \gamma^{min}_{n,c}}{p^{min} - p^{max}} \]

and

\[ b_{n,c} = \frac{\gamma^{max}_{n,c} \times p^{max} - \gamma^{min}_{n,c} \times p^{min}}{p^{max} - p^{min}}. \]

As a result, we have:

\[ x^h_{n,c} = f_{n,c}(p^h) = a_{n,c} \times p^h + b_{n,c}, \quad \forall h \in \mathcal{H}_{n,c}. \tag{3.9} \]
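The two coefficients can be checked numerically. The short Python sketch below evaluates them for assumed values and confirms that the line passes through both end points; the function name `demand_coefficients` and the price bounds of 2 and 6 are hypothetical.

```python
def demand_coefficients(g_min, g_max, p_min, p_max):
    """Slope a_{n,c} and intercept b_{n,c} of the linear demand function."""
    a = (g_max - g_min) / (p_min - p_max)
    b = (g_max * p_max - g_min * p_min) / (p_max - p_min)
    return a, b


# Assumed example: consumption between 2.5 and 5.0, prices between 2 and 6.
a, b = demand_coefficients(g_min=2.5, g_max=5.0, p_min=2.0, p_max=6.0)
f = lambda p: a * p + b          # demand function of Eq. (3.9)
print(a, b, f(2.0), f(4.0), f(6.0))
```

The slope is negative whenever the maximum level exceeds the minimum level, so consumption falls as the price rises. When the two levels coincide, the slope is zero and consumption does not react to price.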

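To align with the shiftable and non-shiftable appliances model, we model the payment bill optimization problem for curtailable appliances as follows:

\[
\begin{aligned}
\min J_{C_n} = \min_{x^h_{n,c}} \ & \sum_{c \in \mathcal{C}_n} \sum_{h=\alpha_{n,c}}^{\beta_{n,c}} p^h \times x^h_{n,c} \\
\text{s.t.} \ & \gamma^{min}_{n,c} \leq x^h_{n,c} \leq \gamma^{max}_{n,c}, \quad \forall h \in \mathcal{H}_{n,c}, \\
& x^h_{n,c} = f_{n,c}(p^h), \quad \forall h \in \mathcal{H}_{n,c}.
\end{aligned}
\tag{3.10}
\]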

Then the optimal electricity consumption scheduling problem for all appliances of customer $n$ can be modelled as (3.11).

\[
\begin{aligned}
\min J_n = \min \ & J_{S_n} + J_{C_n} \\
\text{s.t.} \ & \text{constraints (3.4)-(3.5), (3.8)-(3.9)}.
\end{aligned}
\tag{3.11}
\]

Recall that we have $\mathcal{A}_n = \mathcal{S}_n \cup \mathcal{C}_n$ and $\mathcal{S}_n \cap \mathcal{C}_n = \emptyset$. We define an electricity consumption scheduling vector for each appliance $a \in \mathcal{A}_n$:

\[ x_{n,a} = [x^1_{n,a}, \ldots, x^h_{n,a}, \ldots, x^H_{n,a}] \tag{3.12} \]

where $x^h_{n,a}$ represents the electricity consumption scheduling of appliance $a$ at time $h$. We can get $x_{n,a}$ by solving the minimization problem (3.11).

3.3.2 Profit Maximization Model for the Retailer – Leader Level

In this section, we model the profit of the retailer as its revenue minus the energy cost imposed on the retailer. We first discuss the energy cost model and then propose a profit maximization model.

In a practical application scenario, determining the retail price requires considering many factors, such as the running costs of the retailer, including the payments incurred in the wholesale market. For simplicity, we define a cost function $C_h(L_h)$ indicating the cost to the retailer of providing electricity at each hour $h \in \mathcal{H}$, where $L_h$ represents the amount of power provided to all users at that hour. We assume that the cost function $C_h(L_h)$ is convex and increasing in $L_h$ for each $h$ [MRWJ+10b, LCL11]. In view of this, we design the cost function as follows [MRWJ+10b]:

\[ C_h(L_h) = a_h L_h^2 + b_h L_h + c_h \tag{3.13} \]

where $a_h > 0$, $b_h \geq 0$ and $c_h \geq 0$ at each hour $h \in \mathcal{H}$.

For each hour $h \in \mathcal{H}$, as defined previously, by denoting the minimum price that the retailer (utility company) can offer as $p^{min}$ and the maximum price as $p^{max}$, we have

\[ p^{min} \leq p^h \leq p^{max}. \tag{3.14} \]

Note that there is usually a maximum load capacity of the power network at each hour, denoted as $E^{max}_h$. Thus, we have the following constraints:

\[ \sum_{n \in \mathcal{N}} \sum_{a \in \mathcal{A}_n} x^h_{n,a} \leq E^{max}_h, \quad \forall h \in \mathcal{H} \tag{3.15} \]

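Then the profit maximization problem can be modelled as (3.16).

\[
\begin{aligned}
\max_{p^h} \ & \sum_{h \in \mathcal{H}} p^h \times \sum_{n \in \mathcal{N}} \sum_{a \in \mathcal{A}_n} x^h_{n,a} - \sum_{h \in \mathcal{H}} C_h\Big(\sum_{n \in \mathcal{N}} \sum_{a \in \mathcal{A}_n} x^h_{n,a}\Big) \\
\text{s.t.} \ & p^{min} \leq p^h \leq p^{max}, \\
& \sum_{n \in \mathcal{N}} \sum_{a \in \mathcal{A}_n} x^h_{n,a} \leq E^{max}_h, \quad \forall h \in \mathcal{H}.
\end{aligned}
\tag{3.16}
\]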
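To make the objective and constraints of (3.16) concrete, the following Python sketch evaluates the retailer's profit and checks feasibility for given prices and given customer schedules. It only evaluates the model and does not optimize it. The function names and all numbers (a two-hour horizon with two customers) are assumptions for illustration.

```python
def total_load(schedules, H):
    """L_h: sum of x^h_{n,a} over all customers n and appliances a."""
    return [sum(x[h] for customer in schedules for x in customer)
            for h in range(H)]


def profit(prices, load, a, b, c):
    """Objective of (3.16): revenue minus the quadratic cost (3.13)."""
    revenue = sum(p * L for p, L in zip(prices, load))
    cost = sum(ah * L ** 2 + bh * L + ch
               for ah, bh, ch, L in zip(a, b, c, load))
    return revenue - cost


def feasible(prices, load, p_min, p_max, e_max):
    """Price bounds (3.14) and hourly capacity limits (3.15)."""
    return (all(p_min <= p <= p_max for p in prices)
            and all(L <= E for L, E in zip(load, e_max)))


# Assumed data: schedules[n][a][h] for two customers over H = 2 hours.
schedules = [[[1.0, 0.5]], [[0.5, 0.5], [1.0, 0.0]]]
prices = [4.0, 2.0]
load = total_load(schedules, H=2)
value = profit(prices, load, a=[0.5, 0.5], b=[0.0, 0.0], c=[0.0, 0.0])
ok = feasible(prices, load, p_min=1.0, p_max=5.0, e_max=[3.0, 3.0])
print(load, value, ok)
```

In the full game the schedules are not fixed inputs but the customers' reactions to the prices, which is what makes the retailer's problem bilevel.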

3.3.3 A Two Stage Stackelberg Game Model

We model the interactions between the retailer and its customers as a 1-leader,

N-followers two stage Stackelberg game.

• Stage 1: The retailer determines the electricity prices P = [p1, p2, ..., ph, ...pH ]

to offer to the customers.

• Stage 2: The customers determine their individual electricity demand to

maximize their pay-off, given the price P .

3.4 Stackelberg Game Model Solutions

3.4.1 Existence of Stackelberg Strategy

In the previous section, we have shown that the optimal day-ahead pricing problem for demand response management can be modelled as a Stackelberg game, in which the customers, as the followers, need to find the optimal energy consumption schemes (strategies) that minimize their objective functions given in Eq. (3.11), whereas the retailer, as the leader, needs to find the optimal day-ahead price strategy that maximizes its objective function given in Eq. (3.16). Before developing algorithms to find the optimal energy consumption schemes for the customers and the optimal day-ahead price strategy for the retailer, we need to know whether such schemes and such a price strategy exist. In other words, does an optimal Stackelberg strategy exist for the considered Stackelberg pricing game? The following lemma and theorems give a positive answer to this question.

Recall the definition of the Stackelberg game and equilibrium (strategy) given as Definition 2 in Section 3.2. When $N = 1$, it reduces to the standard definition of a Stackelberg strategy for a two-person (i.e. one leader and one follower) Stackelberg game. In this case, the index in $F_1$ is omitted and the follower is denoted as $F$. For the two-person Stackelberg game, the following lemma on the existence of a Stackelberg strategy (equilibrium) was given and proved in [SCJ73].

Lemma 1. ([SCJ73]) For a two-person Stackelberg game, if $U_L$ and $U_F$ are compact sets, $U_L \subset \mathbb{R}^{n_L}$ and $U_F \subset \mathbb{R}^{n_F}$, and if $J_L(u_L, u_F)$ and $J_F(u_L, u_F)$ are real-valued continuous functions on $U_L \times U_F$, then a Stackelberg strategy (equilibrium) exists.

Now using Lemma 1, the following theorem can be proved.

Theorem 1. For the considered one-leader and $N$-follower Stackelberg game given in Section 3.2, if $U_L$ and $U_{F_n}$ $(n = 1, 2, \ldots, N)$ are compact sets, $U_L \subset \mathbb{R}^{n_L}$ and $U_{F_n} \subset \mathbb{R}^{n_{F_n}}$ $(n = 1, 2, \ldots, N)$, and if $J_L(u_L, u_{F_1}, \ldots, u_{F_N})$ and $J_{F_n}(u_L, u_{F_n})$ $(n = 1, 2, \ldots, N)$ are real-valued continuous functions on $U_L \times U_{F_1} \times \cdots \times U_{F_N}$, then a Stackelberg strategy exists.

Proof. Firstly, consider the following two-person Stackelberg game: the leader's strategy is $u_L \in U_L$ and the follower's strategy is $\bar{u}_F = (u_{F_1}, \ldots, u_{F_N}) \in \bar{U}_F = U_{F_1} \times \cdots \times U_{F_N}$. Further, the leader's pay-off function is $J_L(u_L, \bar{u}_F)$ and the follower's pay-off function is

\[ \bar{J}_F(u_L, \bar{u}_F) = \sum_{n=1}^{N} J_{F_n}(u_L, u_{F_n}). \]

Under the given conditions of Theorem 1, it follows immediately that $U_L$ and $\bar{U}_F$ are compact sets, $U_L \subset \mathbb{R}^{n_L}$ and $\bar{U}_F \subset \mathbb{R}^{\sum_{n=1}^{N} n_{F_n}}$, and $J_L(u_L, \bar{u}_F)$ and $\bar{J}_F(u_L, \bar{u}_F)$ are real-valued continuous functions on $U_L \times \bar{U}_F$. Then, from Lemma 1, a Stackelberg strategy $(u^S_L, \bar{u}^S_F) = (u^S_L, u^S_{F_1}, \ldots, u^S_{F_N})$ exists. Based on the definition of a Stackelberg strategy, it is known that

\[ u^S_L = \arg\max_{u_L \in U_L} J_L[u_L, \bar{R}_F(u_L)] \tag{3.17} \]

with $\bar{R}_F(u_L) = [\bar{R}_{F_1}(u_L), \ldots, \bar{R}_{F_N}(u_L)]$ being the follower's reaction function, and

\[ \bar{u}^S_F = (u^S_{F_1}, \ldots, u^S_{F_N}) = \bar{R}_F(u^S_L) = [\bar{R}_{F_1}(u^S_L), \ldots, \bar{R}_{F_N}(u^S_L)]. \tag{3.18} \]

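Based on the definition of the reaction function given in Eq. (3.1), it is known that

\[
\begin{aligned}
\bar{R}_F(u_L) = [\bar{R}_{F_1}(u_L), \ldots, \bar{R}_{F_N}(u_L)] &= \arg\max_{\bar{u}_F \in \bar{U}_F} \bar{J}_F(u_L, \bar{u}_F) \\
&= \arg\max_{(u_{F_1}, \ldots, u_{F_N}) \in U_{F_1} \times \cdots \times U_{F_N}} \sum_{n=1}^{N} J_{F_n}(u_L, u_{F_n}) \\
&= \Big[\arg\max_{u_{F_1} \in U_{F_1}} J_{F_1}(u_L, u_{F_1}), \ldots, \arg\max_{u_{F_N} \in U_{F_N}} J_{F_N}(u_L, u_{F_N})\Big]
\end{aligned}
\tag{3.19}
\]

where the last equality holds because the $n$-th term of the sum depends only on $u_{F_n}$, so the sum is maximized by maximizing each term separately.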

Noting that $R_{F_n}(u_L) = \arg\max_{u_{F_n} \in U_{F_n}} J_{F_n}(u_L, u_{F_n})$ $(n = 1, 2, \ldots, N)$, the above equality implies $R_{F_n}(u_L) = \bar{R}_{F_n}(u_L)$ $(n = 1, 2, \ldots, N)$ and thus

\[ \bar{R}_F(u_L) = [\bar{R}_{F_1}(u_L), \ldots, \bar{R}_{F_N}(u_L)] = [R_{F_1}(u_L), \ldots, R_{F_N}(u_L)]. \]

Substituting the above equality into (3.17) and (3.18), we have

\[
\begin{aligned}
u^S_L &= \arg\max_{u_L \in U_L} J_L[u_L, \bar{R}_F(u_L)] \\
&= \arg\max_{u_L \in U_L} J_L[u_L, R_{F_1}(u_L), \ldots, R_{F_N}(u_L)]
\end{aligned}
\tag{3.20}
\]

\[ (u^S_{F_1}, \ldots, u^S_{F_N}) = [\bar{R}_{F_1}(u^S_L), \ldots, \bar{R}_{F_N}(u^S_L)] = [R_{F_1}(u^S_L), \ldots, R_{F_N}(u^S_L)] \tag{3.21} \]

Based on Definition 2, the above implies that $(u^S_L, \bar{u}^S_F) = (u^S_L, u^S_{F_1}, \ldots, u^S_{F_N})$ is a Stackelberg strategy, and this ends the proof of existence.

Theorem 2. Consider the Stackelberg game with one leader and N followers as

follows: 1) The leader’s objective function is defined in Eq.(3.16) and its strategy

space is defined by the constraints (3.14)-(3.15); 2) There are N followers, in

which the followers’ objective functions are defined by Eq.(3.11) and their strategy

spaces are defined by the constraints (3.4) – (3.5) and (3.8) – (3.9). Then the

optimal Stackelberg strategy exists.

Proof. For the Stackelberg game considered above, it can be verified as below

that the conditions of Theorem 1 hold.

1. For each customer's bill minimization problem, the constraint set is bounded in a finite-dimensional real space owing to constraints (3.5) and (3.8). Further, all other constraints are equalities or non-strict (less-than-or-equal) inequalities, which implies that the constraint set is closed. As every bounded and closed set in a finite-dimensional real space is compact, the compactness condition holds for each follower.

2. For the retailer's profit maximization problem, the constraint set is bounded in a finite-dimensional real space because each price is bounded by the minimum and maximum price bounds. Further, all other constraints are equalities or non-strict inequalities, which implies that the constraint set is closed. For the same reason as before, the compactness condition holds for the leader.

3. The continuity of the pay-off functions is immediate, as the pay-off functions of the customers (minus one times the bill functions) are bilinear functions, whereas the pay-off function of the retailer (the profit function) is a quadratic function.

As the conditions of Theorem 1 hold, it implies immediately the existence of

Stackelberg strategy for the above considered Stackelberg game and this ends the

proof.

3.4.2 Problem Transformation and Solutions

As aforementioned, the proposed 1-leader, N-follower Stackelberg game is equivalent to a bilevel program with one upper-level decision agent and N lower-level decision agents. The common solution to such bilevel programs is to cast them into equivalent single-level non-linear programming problems by replacing the lower-level problems with their Karush-Kuhn-Tucker (KKT) optimality conditions. The resulting single-level problem is NP-hard and can be approximately solved by existing solvers. Note that even for NP-hard problems, on the one hand, existing solvers can still sometimes find optimal solutions; on the other hand, in most real applications we are more interested in finding a close-to-optimal solution, as long as it is better than the original one. The KKT-condition-based approach can be applied to this problem because the upper-level variables $p^h$ can be regarded as parameters in the lower-level problems and the lower-level problems are linear in the continuous variables $x_{n,a}$. As a result, the solutions to the original Stackelberg game problem are also solutions to the transformed single-level problem and vice versa [CCMGB06]. Note that, with the upper-level variables $p^h$ fixed, the objective function $J_{C_n}$ of the curtailable appliances model Eq. (3.10) is a constant. Therefore, the Lagrangian function associated with the lower-level problem Eq. (3.11) involves only the shiftable appliances model, and is defined as follows:

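\[
\begin{aligned}
\mathcal{L}_n = J_n &+ \sum_{s \in \mathcal{S}_n} \lambda^{1}_{n,s} \Big( \sum_{h=\alpha_{n,s}}^{\beta_{n,s}} x^h_{n,s} - E_{n,s} \Big) \\
&+ \sum_{s \in \mathcal{S}_n} \sum_{h \in \mathcal{H}_{n,s}} \mu^{1h}_{n,s} \big( x^h_{n,s} - \gamma^{max}_{n,s} \big) \\
&- \sum_{s \in \mathcal{S}_n} \sum_{h \in \mathcal{H}_{n,s}} \mu^{2h}_{n,s} \big( x^h_{n,s} - \gamma^{min}_{n,s} \big)
\end{aligned}
\tag{3.22}
\]

where $\lambda^{1}_{n,s}$, $\mu^{1h}_{n,s}$ and $\mu^{2h}_{n,s}$ are the KKT multipliers associated with the lower-level constraints (3.4) and (3.5) respectively.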

In addition to the primal feasibility constraints (3.4) and (3.5), the KKT necessary optimality conditions of the lower-level problem are given as follows:

\[ \frac{\partial \mathcal{L}_n}{\partial x^h_{n,s}} = p^h + \lambda^{1}_{n,s} + \mu^{1h}_{n,s} - \mu^{2h}_{n,s} = 0 \tag{3.23} \]

\[ \mu^{1h}_{n,s} \geq 0 \tag{3.24} \]

\[ \mu^{2h}_{n,s} \geq 0 \tag{3.25} \]

\[ \mu^{1h}_{n,s} \big( \gamma^{max}_{n,s} - x^h_{n,s} \big) = 0 \tag{3.26} \]

\[ \mu^{2h}_{n,s} \big( x^h_{n,s} - \gamma^{min}_{n,s} \big) = 0 \tag{3.27} \]

where (3.23) represents the stationarity conditions, (3.24)-(3.25) are the dual feasibility conditions and (3.26)-(3.27) are the complementary slackness conditions.

Note that the complementary slackness conditions are non-linear constraints. In order to linearise these conditions, we make use of the Fortuny-Amat linearisation [FAM81]. As a result, Eq. (3.26) and Eq. (3.27) can be replaced by the following constraints:

\[ \gamma^{max}_{n,s} - x^h_{n,s} \leq z^{1}_{n,s,h} M^{1}, \tag{3.28} \]

\[ \mu^{1h}_{n,s} \leq \big( 1 - z^{1}_{n,s,h} \big) M^{1}, \tag{3.29} \]

\[ x^h_{n,s} - \gamma^{min}_{n,s} \leq z^{2}_{n,s,h} M^{2}, \tag{3.30} \]

\[ \mu^{2h}_{n,s} \leq \big( 1 - z^{2}_{n,s,h} \big) M^{2}, \tag{3.31} \]

\[ z^{1}_{n,s,h}, z^{2}_{n,s,h} \in \{0, 1\}, \tag{3.32} \]

where $M^{1}$ and $M^{2}$ are sufficiently large constants.
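The effect of the linearisation can be checked on a few points. The Python sketch below, written under assumed values for the upper bound and for the big-M constant, tests whether some binary value satisfies (3.28)-(3.29) together with primal and dual feasibility, and compares the answer with the original condition (3.26). The function names are hypothetical.

```python
def bigm_feasible(x, mu, g_max, M):
    """True if some binary z satisfies (3.28)-(3.29), given primal
    feasibility x <= g_max and dual feasibility mu >= 0."""
    if x > g_max or mu < 0:
        return False
    return any(g_max - x <= z * M and mu <= (1 - z) * M for z in (0, 1))


def complementary(x, mu, g_max):
    """Original non-linear complementary slackness condition (3.26)."""
    return mu * (g_max - x) == 0


g_max, M = 1.0, 10.0   # assumed values; M must dominate both sides
points = [(x, mu) for x in (0.0, 0.5, 1.0) for mu in (0.0, 0.5, 3.0)]
agree = all(bigm_feasible(x, mu, g_max, M) == complementary(x, mu, g_max)
            for x, mu in points)
print(agree)

# With M chosen too small, a point that satisfies (3.26) is wrongly cut off.
print(bigm_feasible(0.5, 0.0, g_max, M=0.1))
```

The second call illustrates why the constants must be large enough: they have to exceed both the largest possible slack and the largest possible multiplier value, otherwise valid solutions are excluded.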

Finally, the equivalent single-level non-linear programming problem is shown as follows:

\[
\begin{aligned}
\max \ & \sum_{h \in \mathcal{H}} p^h \times \sum_{n \in \mathcal{N}} \sum_{a \in \mathcal{A}_n} x^h_{n,a} - \sum_{h \in \mathcal{H}} C_h\Big(\sum_{n \in \mathcal{N}} \sum_{a \in \mathcal{A}_n} x^h_{n,a}\Big) \\
\text{subject to} \ & \text{constraints (3.14)-(3.15), (3.4)-(3.5), (3.8)-(3.9),} \\
& \text{(3.23)-(3.25) and (3.28)-(3.32)}
\end{aligned}
\tag{3.33}
\]

Note that the objective function of the resulting single-level problem Eq. (3.33) is quadratic and its constraints are either linear or involve binary variables. As a result, problem Eq. (3.33) is a Mixed Integer Quadratic Programming (MIQP) problem and can be solved by existing solvers such as Gurobi [OPTa] or TOMLAB [OPTb].

3.5 Numerical Results

We simulate a neighbourhood consisting of 1000 customers served by one energy

retailer. We assume that each customer has 8 appliances. Recall that we have

appliances in three categories: shiftable appliances, non-shiftable appliances and

curtailable appliances. Note that the scheduling horizon is from 8AM to 8AM

(the next day).

For the cost of the energy provided to customers by the utility company, we choose a quadratic cost function $C_h(L_h) = a_h L_h^2 + b_h L_h + c_h$, where $L_h$ represents the amount of power provided to all users at each hour of the day. For simplicity we assume that $b_h = 0$ and $c_h = 0$ for all $h \in \mathcal{H}$. Also, we set $a_h = 5.5 \times 10^{-4}$ cents during the day, i.e. from 8AM to 12AM, and $a_h = 4.0 \times 10^{-4}$ cents at night, i.e. from 12AM to 8AM (the next day).

Observing the load patterns of curtailable appliances, we group users into three categories: sensitive users, mid-sensitive users and insensitive users. Sensitive users reduce their energy consumption of curtailable appliances when the price goes up and vice versa. Insensitive users are immune to the price signals and keep their energy consumption at the usual level.

Finally, the parameter settings of each category of appliances for the different types of users are given in Tables 3.1 to 3.6.

Because different studies use different scheduling and pricing models as well as different datasets, there are no available benchmarks to compare our results against. Therefore, to carry out the evaluation with a suitable benchmark for energy scheduling, we assume that, without our proposed optimal scheduling scheme, each appliance starts its operation right at the beginning of its time interval $\mathcal{H}_a$ and runs at its typical power level.

The following evaluation is carried out in two parts. The first part evaluates the benefits of our proposed energy scheduling framework to the customers and the energy retailer under publicly available dynamic electricity price data. The second part evaluates the feasibility of our proposed smart pricing scheme and its benefits to the customers and the energy retailer under the resulting optimized dynamic electricity prices.

Table 3.1: Shiftable Appliances' parameters for each home

Appliance Name  | $E_s$    | $H_s$    | $\gamma^{min}_s$ | $\gamma^{max}_s$
Dish washer     | 1.8 kWh  | 8PM-6AM  | 0.1 kWh          | 1.0 kWh
Washing machine | 1.94 kWh | 8AM-8PM  | 0.1 kWh          | 1.0 kWh
Clothes dryer   | 3.4 kWh  | 7PM-7AM  | 0.25 kWh         | 3.0 kWh
PHEV            | 9.9 kWh  | 8PM-7AM  | 0.3 kWh          | 2.0 kWh

Table 3.2: Non-shiftable Appliances' parameters for each home

Appliance Name       | $E_s$    | $H_s$    | $\gamma^{min}_s$ | $\gamma^{max}_s$
Refrigerator-freezer | 1.32 kWh | 8AM-8AM  | 0 kWh            | 0.055 kWh
Oven                 | 2.4 kWh  | 6PM-8PM  | 0 kWh            | 1.2 kWh

Table 3.3: Parameters of curtailable appliances for each sensitive user

Appliance Name          | $H_c$     | $\gamma^{min}_c$ | $\gamma^{max}_c$
Space heater            | 9PM-12AM  | 0 kWh            | 1.0 kWh
Central air conditioner | 6PM-11PM  | 0 kWh            | 5.0 kWh

Table 3.4: Parameters of curtailable appliances for each mid-sensitive user

Appliance Name          | $H_c$     | $\gamma^{min}_c$ | $\gamma^{max}_c$
Space heater            | 9PM-12AM  | 0.5 kWh          | 1.0 kWh
Central air conditioner | 6PM-11PM  | 2.5 kWh          | 5.0 kWh

Table 3.5: Parameters of curtailable appliances for each insensitive user

Appliance Name          | $H_c$     | $\gamma^{min}_c$ | $\gamma^{max}_c$
Space heater            | 9PM-12AM  | 1.0 kWh          | 1.0 kWh
Central air conditioner | 6PM-11PM  | 5.0 kWh          | 5.0 kWh

Table 3.6: Typical consumption level for each category of appliances

Appliance Type           | Typical consumption level
Shiftable appliances     | $\gamma^{max}_s$
Non-shiftable appliances | $\gamma^{max}_s$
Curtailable appliances   | $\gamma^{max}_c$

3.5.1 Benefits to Customers and Energy Retailer Based on Public Dynamic Electricity Prices

We use the actual day-ahead tariffs adopted by the Illinois Power Company (IPC) from March 1, 2012 to March 31, 2012, which are publicly available online at [Ser12]. The prices are set day-ahead by the hourly wholesale electricity market run by the Midwest Independent System Operator (MISO). We assume that each user is equipped with a smart meter capable of two-way communication. As aforementioned, to design the benchmark, we assume that, without our proposed optimal scheduling scheme, each appliance starts operation right at the beginning of its time interval $\mathcal{H}_a$ and runs at its typical power level. It is worth mentioning that the dynamic energy prices used in this subsection and the following subsection are hourly prices that differ from day to day, while the parameters of the energy scheduling simulations, shown in Tables 3.1 to 3.6, are the same for each day.

We begin to compare the payment bills as well as the peak-to-average ratio

(PAR)(see the definition below) in the residential load when using our proposed

optimal scheduling scheme with the benchmark. Clearly, the user is interested

in reducing its payment while the utility company(retailer) is more interested in

having a more balanced load demand with a lower PAR [MRWJ+10b].

Peak-to-average ratio (PAR). Define the daily load for user $n$ as $l_n = [l_n^1, \ldots, l_n^H]$, where $H = 24$. The total load across all users at each hour of the day $h \in \mathcal{H}$ can then be calculated as [MRWJ+10b]:

$$L_h \triangleq \sum_{n \in \mathcal{N}} l_n^h \tag{3.34}$$

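Then the daily peak and average load are calculated as:

$$L_{\text{peak}} = \max_{h \in \mathcal{H}} L_h \tag{3.35}$$

$$L_{\text{avg}} = \frac{1}{H} \sum_{h \in \mathcal{H}} L_h \tag{3.36}$$

Thus, the PAR can be defined as:

$$\text{PAR} = \frac{L_{\text{peak}}}{L_{\text{avg}}} = \frac{H \max_{h \in \mathcal{H}} L_h}{\sum_{h \in \mathcal{H}} L_h} \tag{3.37}$$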
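The PAR definition in Eqs. (3.34) to (3.37) can be illustrated with a minimal Python sketch. The function name and the toy load values are illustrative assumptions and are not taken from the simulations in this chapter.

```python
from typing import Sequence


def peak_to_average_ratio(user_loads: Sequence[Sequence[float]]) -> float:
    """Return the PAR of the aggregate load.

    user_loads[n][h] plays the role of l_n^h (load of user n at hour h).
    """
    horizon = len(user_loads[0])
    # Eq. (3.34): total load across all users at each hour.
    total = [sum(user[h] for user in user_loads) for h in range(horizon)]
    peak = max(total)                 # Eq. (3.35)
    average = sum(total) / horizon    # Eq. (3.36)
    return peak / average             # Eq. (3.37)


# Toy example with two users and a 4-hour horizon (hypothetical values, kWh).
loads = [[1.0, 2.0, 3.0, 2.0],
         [1.0, 1.0, 3.0, 0.0]]
par = peak_to_average_ratio(loads)
print(round(par, 4))
```

A perfectly flat aggregate load gives a PAR of 1, which is the lower bound of this measure.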

Simulations Without Curtailable Appliances

Figures 3.2 and 3.3 show the trends of the daily payments and the PAR of one residential load based on the day-ahead prices adopted by the Illinois Power Company (IPC) from March 1, 2012 to March 31, 2012. With our proposed optimal scheduling scheme deployed, the user's daily payment is reduced by 36.77%, from 50.29 cents to 31.80 cents, which means the monthly payment can be reduced from $15.59 to only $11.40. In addition, the average PAR of the daily load decreases by 10.92%, from 8.15 to 7.26. As we only consider shiftable and non-shiftable appliances in this simulation, the total energy usage of this customer on each day is the same in both cases (with and without the scheduling framework), namely 20.76 kWh, which can be calculated directly from Tables 3.1 and 3.2.

Simulations With Curtailable Appliances

We carry out the simulations for three kinds of users: sensitive users, mid-sensitive users and insensitive users. Figures 3.4 to 3.9 show the daily payment and PAR trends for one residential house based on the day-ahead pricing over a month. The results in Tables 3.7 and 3.8 suggest that deploying our proposed optimal scheduling scheme is beneficial not only for the customers but also for the utility company. Looking more closely, Table 3.7 shows that price-sensitive customers save more (57.77 cents per day) when our real-time consumption scheduling framework is deployed, whereas insensitive customers save less (only 18.50 cents per day). However, Table 3.8 shows that the average PAR of the daily load for insensitive customers is reduced by 36.17%, from 5.28 to 3.37, whereas that of sensitive customers decreases by only 17.61%, from 5.28 to 4.35.

Figure 3.2: Payment bills without curtailable appliances over one month
(x-axis: Day, 1/03/2012 to 31/03/2012; y-axis: Daily Electricity Payment (cents); series: with scheduling, without scheduling)

Figure 3.3: PAR without curtailable appliances over one month
(x-axis: Day, 1/03/2012 to 31/03/2012; y-axis: PAR − Single Residential Load; series: with scheduling, without scheduling)

Figure 3.4: Payment with curtailable appliances over one month – sensitive user
(x-axis: Day, 1/03/2012 to 31/03/2012; y-axis: Daily Electricity Payment (cents); series: with scheduling, without scheduling)

Figure 3.5: PAR with curtailable appliances over one month – sensitive user
(x-axis: Day, 1/03/2012 to 31/03/2012; y-axis: PAR − Single Residential Load; series: with scheduling, without scheduling)

Figure 3.6: Payment with curtailable appliances over one month – mid-sensitive user
(x-axis: Day, 1/03/2012 to 31/03/2012; y-axis: Daily Electricity Payment (cents); series: with scheduling, without scheduling)

Figure 3.7: PAR with curtailable appliances over one month – mid-sensitive user
(x-axis: Day, 1/03/2012 to 31/03/2012; y-axis: PAR − Single Residential Load; series: with scheduling, without scheduling)

Figure 3.8: Payment with curtailable appliances over one month – insensitive user
(x-axis: Day, 1/03/2012 to 31/03/2012; y-axis: Daily Electricity Payment (cents); series: with scheduling, without scheduling)

Figure 3.9: PAR with curtailable appliances over one month – insensitive user
(x-axis: Day, 1/03/2012 to 31/03/2012; y-axis: PAR − Single Residential Load; series: with scheduling, without scheduling)

Table 3.7: Average Daily Bill comparison of each type of user over one month

User type With scheduling Without scheduling

Sensitive 71.92 cents 129.69 cents

Mid-sensitive 91.56 cents 129.69 cents

Insensitive 111.19 cents 129.69 cents

Table 3.8: Average PAR in daily load for each type of user over one month

User type With scheduling Without scheduling

Sensitive 4.35 5.28

Mid-sensitive 3.63 5.28

Insensitive 3.37 5.28

Table 3.9: Combinations of users for case study

Case number   Sensitive users   Mid-sensitive users   Insensitive users
1             200               700                   100
2             450               450                   100
3             700               200                   100

3.5.2 Benefits to Customers and Energy Retailer Based on Our Proposed Optimal Smart Pricing Scheme

We simulate a utility company serving 1000 customers consisting of sensitive users, mid-sensitive users and insensitive users, where each customer has 8 appliances.

The aim of our proposed optimal smart pricing scheme is to find the optimal 24-hour day-ahead prices by maximizing the retailer's profit (leader-level problem). In addition, given these prices, the customers can achieve their best benefit, i.e. minimize their payment bills, by deploying our proposed optimal scheduling scheme (follower-level problem).

Borrowing the idea of time-of-use (ToU) pricing, we divide the 24 hourly prices into three levels: on-peak hours (5PM-12AM), mid-peak hours (8AM-5PM) and off-peak hours (12AM-8AM).

We carry out 3 test cases, whose details can be found in Table 3.9.

Table 3.10: Bill comparison of each type of user under Case 1

User type With scheduling under optimized prices Benchmark

Sensitive 270.81 cents 717.83 cents

Mid-sensitive 467.58 cents 717.83 cents

Insensitive 664.35 cents 717.83 cents

Table 3.11: Bill comparison of each type of user under Case 2

User type With scheduling under optimized prices Benchmark

Sensitive 258.53 cents 665.74 cents

Mid-sensitive 439.91 cents 665.74 cents

Insensitive 621.30 cents 665.74 cents

Table 3.12: Bill comparison of each type of user under Case 3

User type With scheduling under optimized prices Benchmark

Sensitive 205.25 cents 655.12 cents

Mid-sensitive 409.10 cents 655.12 cents

Insensitive 612.95 cents 655.12 cents

Result Analysis

Figures 3.10, 3.11 and 3.12 show the energy consumption patterns of the different types of users under the different case studies. Note that the valid scheduling time for curtailable appliances runs from 6PM to 12AM in the simulation. Taking Case 1 as an example, the sensitive user consumes the least energy during the period 6PM-12AM, while the insensitive user consumes the most energy in the same period.

Tables 3.10, 3.11 and 3.12 give the daily payments of the different types of users. In every case, the sensitive users pay less than the insensitive users.

It should also be highlighted that, under all the case studies, the energy consumption of each type of user is shifted from higher price periods to lower price periods by the deployment of our optimal smart pricing scheme, which leads to a lower PAR and is beneficial to the retailer.

3.6 Chapter Summary

In this chapter, we proposed a Stackelberg game to model the interactions between the retailer and its customers, utilizing the benefits of smart grids. Firstly, we divided home appliances into shiftable, non-shiftable and curtailable appliances according to their load types, and proposed a different appliance-level optimization model for each category. Secondly, we modelled a profit maximization problem for the retailer and proposed a KKT condition based approach to solve it. The simulation results show that both the retailer and its customers can benefit from the proposed game framework. It therefore has great potential to improve the implementation of current energy pricing programs, help customers reduce their increasing energy bills, and change customers' energy usage patterns.

Figure 3.10: Energy consumption of different users under Case 1
(Panels: (a) Sensitive user, (b) Mid-sensitive user, (c) Insensitive user. Each panel plots hourly energy consumption (kWh) with and without scheduling, together with the electricity price (cents), against the hour of the day, with axis ticks from 8AM to 5AM.)

Figure 3.11: Energy consumption of different users under Case 2
(Panels: (a) Sensitive user, (b) Mid-sensitive user, (c) Insensitive user. Each panel plots hourly energy consumption (kWh) with and without scheduling, together with the electricity price (cents), against the hour of the day, with axis ticks from 8AM to 5AM.)

Figure 3.12: Energy consumption of different users under Case 3
(Panels: (a) Sensitive user, (b) Mid-sensitive user, (c) Insensitive user. Each panel plots hourly energy consumption (kWh) with and without scheduling, together with the electricity price (cents), against the hour of the day, with axis ticks from 8AM to 5AM.)

Chapter 4

Smart Pricing to Demand Response Management II – A Bilevel Optimization Based Approach

4.1 Introduction

In Chapter 3, we adopted a Stackelberg game model to represent the interactions between the energy retailer and its customers, and proposed a KKT condition based approach to solve the small-scale demand response problems faced by the retailer. In this chapter, we use an alternative model, bilevel optimization, to model the smart pricing based demand response management problem. Compared with the problem formulations in Chapter 3, this chapter considers a more comprehensive home energy management model at the customer side and a more applicable smart pricing model at the retailer side. More specifically, for the customer-side problem we consider the most commonly used types of home appliances (e.g. interruptible, non-interruptible and curtailable appliances), the most likely applications of curtailable appliances (e.g. maximizing customers' life comfort and minimizing customers' bills) and an easy-to-use waiting time cost model for interruptible and non-interruptible appliances. Such a comprehensive home energy management model, which is missing in the existing literature, is a substantial improvement over the previous chapter. Further, we propose a more applicable smart pricing model at the retailer side by adding a total revenue constraint for the retailer, which ensures that the customers are offered a sufficient number of low price periods. Although the Stackelberg game and bilevel optimization share many properties, in this thesis we separate them into two chapters (Chapter 3 and Chapter 4) for two reasons. On the one hand, as mentioned earlier, the Stackelberg game is more commonly studied from the application perspective, i.e. modelling the interactions and game behaviours between different players, while bilevel optimization is more widely known from the optimization perspective, i.e. designing proper solution algorithms to solve the problem, as we do in this chapter. On the other hand, Chapter 3 can be seen as our first attempt to solve the problem at a small scale from the application perspective, whereas Chapter 4 is a significant extension of that work, approached more from the optimization perspective in order to solve the problem at a large scale. Recall that the solution approach to the Stackelberg game proposed in the previous chapter is KKT condition based; it cannot be used for large-scale problems and might pose privacy issues for customers. To overcome this, in this chapter we propose distributed solution approaches, with multi-population genetic algorithms for the retailer-side problem and distributed individual algorithms for each customer-side problem, which are more adaptable and usable for large-scale problems. This chapter is adapted from our two papers [MZ14] [MZss].

The rest of this chapter is organized as follows. Firstly, the preliminary knowl-

edge of genetic algorithms, which will be used to solve the bilevel optimization

problem, is given in Section 4.2. Secondly, the problem statement is given in Sec-

tion 4.3. Thirdly, the bilevel model problem formulation for the demand response

problem is presented in Section 4.4. More specifically, the customer-side problem

at the lower-level is given in subsection 4.4.1 and the retailer-side problem at the

upper-level is given in subsection 4.4.2. Fourthly, Section 4.5 gives the proof of

existence of optimal solutions to the bilevel model, the solution approach to the

lower-level problem for customers and the multi-population GA based solution

approach to the upper-level problem faced by the retailer. Finally, the numeri-

cal results that show the benefits of the proposed smart pricing approach to the

retailer and its customers are given in Section 4.6. This chapter is concluded in

Section 4.7.

Figure 4.1: Flowchart of a Typical Genetic Algorithm

4.2 Preliminary Knowledge – Genetic Algorithms

Genetic algorithms (GAs), proposed by Holland in 1975, are inspired by Darwin's theory of evolution. A GA applies a series of genetic operations such as selection, crossover and mutation to the current population, which gradually evolves towards the optimal solution [RMG00]. Figure 4.1 shows the flowchart of a typical genetic algorithm.

4.2.1 Representation

The binary-coded representation is adopted in this thesis as it is the most natural choice and is suitable for the problem considered.

The decision variables of our proposed smart pricing problem are $p_h$, where $p_h$ is the energy price that the retailer offers to the customers at hour $h$. For each $h \in \{1, 2, \ldots, 24\}$, we have $p_h^{\min} \le p_h \le p_h^{\max}$, where $p_h^{\min}$ is the hourly minimum price that the retailer can offer and $p_h^{\max}$ is the hourly maximum price that the retailer can offer. For example, we set 8.00 pence $\le p_h \le$ 14.00 pence. Since the length of the binary string for the chromosome depends on the range and the precision of the decision variables, we assume that the precision requirement $d$ for each decision variable (each hourly price $p_h$) is two decimal places (of the small unit of the given currency), i.e. $d = 10^{-2}$, and denote the length of the binary string by $l_h$. To map the real variable $p_h$ into a corresponding binary form [BN13], we require

$$2^{l_h - 1} \le (p^{\max} - p^{\min}) \times \frac{1}{d} \le 2^{l_h} \tag{4.1}$$

Thus, for our problem, we have

$$2^{l_h - 1} \le 600 \le 2^{l_h}$$

and

$$2^{9} \le 600 \le 2^{10}$$

Hence, the required bit length for each variable is $l_h = 10$.

Note that at least 10 bits per variable are therefore needed in the binary representation for our problem to be encoded with the required precision of two decimal places.

Algorithm 1 Tournament Selection

Input: The population P_1, P_2, ..., P_N and the tournament size k = 2;
Output: The population after selection P'_1, P'_2, ..., P'_N;

for i = 1 to N do
    P'_i ← best fit individual out of k randomly picked individuals from P_1, P_2, ..., P_N;
end for
return P'_1, P'_2, ..., P'_N
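The bit-length calculation in Eq. (4.1) can be checked with a short Python sketch. The decoding rule used here, a linear map of the integer value of the bit string onto the price interval, is our assumption of a common convention and is not stated in the text above. The function names are illustrative.

```python
def bits_required(p_min: float, p_max: float, d: float) -> int:
    """Smallest l with 2**(l-1) <= (p_max - p_min)/d <= 2**l, cf. Eq. (4.1)."""
    steps = round((p_max - p_min) / d)       # 600 for the 8.00-14.00 pence range
    return max(1, (steps - 1).bit_length())  # integer arithmetic avoids log rounding


def decode(bits: str, p_min: float, p_max: float) -> float:
    """Assumed decoding: map the integer value of `bits` linearly onto [p_min, p_max]."""
    value = int(bits, 2)
    return p_min + value * (p_max - p_min) / (2 ** len(bits) - 1)


l_h = bits_required(8.00, 14.00, 0.01)
print(l_h)                                  # bit length per hourly price
print(decode("0" * l_h, 8.00, 14.00))       # lower end of the price range
print(decode("1" * l_h, 8.00, 14.00))       # upper end of the price range
```

Under this assumed decoding, adjacent code words differ by 6/1023 pence, which is finer than the required precision of 0.01 pence.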

4.2.2 Tournament Selection and Elitism

Binary deterministic tournament selection is adopted in this thesis. In this selec-

tion, two individuals are chosen at random and the better of the two individuals

is selected into the mating pool. The process of tournament selection is described

in Algorithm 1 [BT96].
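Binary tournament selection as described above can be sketched in Python as follows. We assume here that a higher fitness value is better and that the k contestants are drawn without replacement; the text does not fix either choice, and the function and parameter names are illustrative.

```python
import random


def tournament_selection(population, fitness, k=2, rng=random):
    """Build a mating pool of the same size as `population`.

    Each slot is filled by the fittest of k individuals picked at random
    (assumption: higher fitness is better, sampling without replacement).
    """
    pool = []
    for _ in range(len(population)):
        contestants = rng.sample(population, k)
        pool.append(max(contestants, key=fitness))
    return pool


random.seed(0)
population = [1, 5, 3, 9, 7]               # toy individuals, fitness = value
pool = tournament_selection(population, fitness=lambda ind: ind)
print(pool)
```

With distinct fitness values and k = 2, the least fit individual can never win a tournament, so it never enters the mating pool.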

To make sure that the best individuals always survive to the next generation, we adopt elitism in the selection process. In elitism, the best chromosome is copied to the next generation without undergoing crossover and mutation. Since elitism stores the best chromosome obtained up to the current generation, it guarantees the reproduction of the best chromosome during the evolutionary search procedure. As a consequence, it improves the convergence of the optimization process as well as the robustness of the algorithm.

A typical genetic algorithm with elitism can be described as follows:

Step 1: Generate an initial population P randomly;

Step 2: Evaluate the population P and find the best chromosome $C_{best}$;

Step 3: Perform the selection, crossover and mutation operations on the population P to get a new population;

Step 4: Find the best chromosome $C'_{best}$ and the worst chromosome $C'_{worst}$ of the new population. If $C'_{best} < C_{best}$, i.e. the new best chromosome is worse than the stored one, let $C'_{worst} = C_{best}$. Otherwise, no replacement takes place.

Step 5: Go to Step 2.
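The replacement rule in Step 4 can be sketched in Python as below. This is a minimal illustration that assumes a higher fitness value is better; the function name and the toy chromosomes are hypothetical.

```python
def apply_elitism(new_population, fitness, previous_best):
    """Step 4: keep the stored best chromosome alive in the new population.

    If the best chromosome of the new population is worse than the stored
    best, the worst new chromosome is overwritten by the stored best.
    Assumption: higher fitness is better.
    """
    population = list(new_population)        # do not modify the caller's list
    new_best = max(population, key=fitness)
    if fitness(new_best) < fitness(previous_best):
        worst = min(range(len(population)), key=lambda i: fitness(population[i]))
        population[worst] = previous_best
    return population


# Toy chromosomes whose fitness is their own value.
print(apply_elitism([3, 1, 2], fitness=lambda c: c, previous_best=5))
print(apply_elitism([6, 1, 2], fitness=lambda c: c, previous_best=5))
```

In the first call the stored best (5) replaces the worst new chromosome (1); in the second call the new population already contains a better chromosome, so nothing is replaced.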

4.2.3 Uniform Crossover

Algorithm 2 [Sys89] describes how the uniform crossover works. Usually, we set

the probability of swapping to 0.5.

Algorithm 2 Uniform Crossover

Input: Two given parental chromosomes P_1 and P_2;
Output: Two offspring P'_1 and P'_2 after uniform crossover;

for i = 1 to L do
    // i represents the gene position of the chromosome and L is the length of the chromosome.
    choose a random real number m from the interval [0, 1];
    if m ≤ p_c then
        // p_c is the probability of swapping.
        P'_1(i) = P_2(i)
        P'_2(i) = P_1(i)
    else
        P'_1(i) = P_1(i)
        P'_2(i) = P_2(i)
    end if
end for
return P'_1, P'_2
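Algorithm 2 translates almost line by line into Python. The sketch below represents chromosomes as lists of bits; the function and parameter names are illustrative.

```python
import random


def uniform_crossover(parent1, parent2, swap_prob=0.5, rng=random):
    """Uniform crossover: swap each gene independently with probability swap_prob."""
    child1, child2 = [], []
    for gene1, gene2 in zip(parent1, parent2):
        if rng.random() <= swap_prob:        # m <= p_c: swap the two genes
            child1.append(gene2)
            child2.append(gene1)
        else:                                # otherwise copy the genes unchanged
            child1.append(gene1)
            child2.append(gene2)
    return child1, child2


random.seed(0)
offspring1, offspring2 = uniform_crossover([0] * 10, [1] * 10)
print(offspring1)
print(offspring2)
```

Because genes are only exchanged between the parents, each gene position of the two offspring always holds exactly the two parental values.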

4.2.4 Mutation

Bit flip, which simply inverts the value of the chosen gene, is adopted in this

thesis. Note that this kind of mutation can only be used for binary genes.
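A minimal Python sketch of bit-flip mutation is given below. The per-gene mutation probability `mutation_prob` is an assumed parameter, since the text does not state how the genes to be flipped are chosen; the function name is illustrative.

```python
import random


def bit_flip_mutation(chromosome, mutation_prob, rng=random):
    """Invert each binary gene independently with probability mutation_prob (assumed)."""
    return [1 - gene if rng.random() < mutation_prob else gene
            for gene in chromosome]


random.seed(0)
print(bit_flip_mutation([0, 1, 0, 1, 1, 0, 0, 1, 0, 1], mutation_prob=0.1))
```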

4.2.5 Constraint Handling

In general, a constrained optimization problem is defined as:

$$
\begin{aligned}
\min \quad & f(\vec{x}) \\
\text{s.t.} \quad & g_j(\vec{x}) \ge 0, \quad j = 1, \ldots, J, \\
& h_k(\vec{x}) = 0, \quad k = 1, \ldots, K, \\
& x_i^{l} \le x_i \le x_i^{u}, \quad i = 1, \ldots, n.
\end{aligned}
\tag{4.2}
$$

In the above constrained optimization problem, there are $n$ variables (i.e. $\vec{x}$ is a vector of size $n$), $J$ inequality constraints, and $K$ equality constraints. The function $f(\vec{x})$ is the objective function, $g_j(\vec{x})$ is the $j$-th inequality constraint, and $h_k(\vec{x})$ is the $k$-th equality constraint. The $i$-th variable varies in the range $[x_i^{l}, x_i^{u}]$.

There are many methods for constrained optimization with genetic algorithms, of which the penalty function method is the most widely used. In the penalty function method for handling inequality constraints in minimization problems, the fitness function $F(\vec{x})$ is defined as the sum of the objective function $f(\vec{x})$ and a penalty term which depends on the constraint violation $P(g_j(\vec{x}))$:

$$F(\vec{x}) = f(\vec{x}) + \sum_{j=1}^{J} R_j \times P(g_j(\vec{x}))^2, \tag{4.3}$$

where $R_j$ are non-negative penalty parameters and $P(g_j(\vec{x})) = |\min\{0, g_j(\vec{x})\}|$ is the magnitude of the violation of the $j$-th constraint (zero when the constraint is satisfied).

As for the equality constraints $h_k(\vec{x}) = 0$, we can convert each of them into two inequality constraints and then handle the resulting inequality constraints with the penalty function method described above:

$$h_k(\vec{x}) \le \theta \ \text{ and } \ h_k(\vec{x}) \ge -\theta, \quad k = 1, \ldots, K, \tag{4.4}$$

where $\theta$ is a small positive value.

According to [Deb00], the above penalty method has two problems:

• The optimal solution of $F(\vec{x})$ depends on the penalty parameters $R_j$, and extensive experimentation is needed to find penalty parameters that are suitable for the problem considered.

• The inclusion of the penalty term distorts the objective function, which may result in finding a local optimum or even no optimum at all.

To avoid the above problems, [Deb00] proposes a penalty method for GAs which eliminates the use of penalty parameters. This penalty method is defined in Eq. (4.5). The equality constraints can be handled in the same way as in Eq. (4.4).

$$
F(\vec{x}) =
\begin{cases}
f(\vec{x}) & \text{if } g_j(\vec{x}) \ge 0, \ j = 1, \ldots, J, \\
f_{\max} + \sum_{j=1}^{J} P(g_j(\vec{x})) & \text{otherwise,}
\end{cases}
\tag{4.5}
$$

where $f_{\max}$ is the objective function value of the worst feasible solution in the population. As a result, the fitness of an infeasible solution depends not only on the amount of constraint violation but also on the population of solutions at hand. Numerical results indicate the efficiency of this constraint handling method for genetic algorithms.
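The parameter-free fitness of Eq. (4.5) can be sketched in Python for a minimization problem as follows. Two points are our assumptions rather than statements from the text: the violation of each constraint is measured by its magnitude, so that an infeasible solution is never better than the worst feasible one, and $f_{\max}$ is set to zero when the population contains no feasible solution. All names are illustrative.

```python
def parameterless_fitness(population, objective, constraints):
    """Fitness values per Eq. (4.5) for a minimization problem.

    `constraints` is a list of callables g_j with g_j(x) >= 0 when satisfied.
    """
    def violation(x):
        # Magnitude of constraint violation (assumed non-negative measure).
        return sum(abs(min(0.0, g(x))) for g in constraints)

    feasible = [x for x in population if violation(x) == 0.0]
    # Worst (largest) objective among feasible solutions; 0 if none (assumption).
    f_max = max(objective(x) for x in feasible) if feasible else 0.0
    return [objective(x) if violation(x) == 0.0 else f_max + violation(x)
            for x in population]


# Toy problem: minimize x**2 subject to x - 1 >= 0.
candidates = [0.0, 1.0, 2.0, 3.0]
print(parameterless_fitness(candidates, lambda x: x ** 2, [lambda x: x - 1.0]))
```

In this toy problem the infeasible candidate x = 0 receives fitness 9 + 1 = 10, which is worse than every feasible candidate even though its raw objective value is the smallest.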

4.3 Problem Statement

In this section, we will describe how the optimal day-ahead pricing for demand

response management can be modelled as a bilevel optimization problem.

It is assumed that each customer is equipped with a smart meter. The interactions between the retailer and its customers are enabled through a two-way communication infrastructure, as shown in Figure 4.2.

Figure 4.2: Bilevel programming model structure

The decision processes of the retailer and the customers are as follows. The utility company (retailer), acting as the upper-level decision agent, first determines the selling price offered to its customers with the aim of maximizing its profit. To solve this profit maximization problem, it is assumed that each customer (lower-level decision agent) reacts optimally to the announced retail price, i.e. each customer (smart meter) determines the optimal energy consumption with the aim of maximizing its benefits, such as minimizing its bills and/or maximizing its quality of life. As a result, the retailer's price determination problem can be modelled as a bilevel problem, i.e. a decision-making problem involving two types of agents who try to optimize their own objective functions over a jointly dependent set. Note that each customer's decision about electricity usage is independent of the other customers' decisions.

Recall that the general formulation of a bilevel optimization problem with one upper-level decision agent and $N$ independent lower-level decision agents can be represented as follows [SMD12] [CG07]:

$$
\begin{aligned}
\max_{x, y_1, \ldots, y_N} \quad & F(x, y_1, \ldots, y_N) \\
\text{such that} \quad & y_i \in \arg\min_{y_i'} \{ f_i(x, y_i') : g_i(x, y_i') \le 0 \}, \quad i = 1, \ldots, N, \\
& G(x, y_1, \ldots, y_N) \le 0, \\
& x \in X, \ y_i \in Y_i.
\end{aligned}
\tag{4.6}
$$

In the above formulation, $F$ represents the upper-level objective function and $f_i$ ($i = 1, 2, \ldots, N$) represent the lower-level objective functions. Similarly, $x$ is the decision vector of the upper-level agent and $y_i$ is the decision vector of the $i$-th lower-level agent. $G$ represents the constraint functions at the upper level and $g_i$ represents the constraint functions of the $i$-th lower-level agent. $X$ denotes the bound constraints for the upper-level decision vector and $Y_i$ denotes the bound constraints for each lower-level decision vector.

A solution $(x^*, y_1^*, \ldots, y_N^*)$ which maximizes the above objective function $F(x, y_1, \ldots, y_N)$ subject to all the constraints is said to be a bilevel optimal solution.

4.4 Bilevel Problem Formulation

In this section, the mathematical representation of the considered two level de-

cision making problems is provided. Firstly, our focus is to formulate the energy

management problem in response to the day-ahead prices in each household at the

lower level. Secondly, we model the profit maximization problem for the retailer

at the upper level who will offer hourly prices of next 24 hours to its customers.

Throughout this chapter, let $\mathcal{N} \triangleq \{1, 2, \ldots, N\}$ denote the set of customers considered, with $N \triangleq |\mathcal{N}|$, and let $\mathcal{H} \triangleq \{1, 2, \ldots, H\}$, where $H$ denotes the scheduling horizon. Usually, $H = 24$.

We define the prices offered by the retailer as a price vector $P = [p_1, \ldots, p_h, \ldots, p_H]$, where $p_h$ represents the electricity price at hour $h$.

4.4.1 Customer-side Problem at the Lower Level

We categorize the home appliances into non-shiftable appliances (e.g. lights),

interruptible appliances (e.g. PHEVs), non-interruptible appliances (e.g. washing

machines, dish washers) and curtailable appliances (e.g. air conditioning, heating)

according to their load types.

Since the non-shiftable appliances consume a fixed amount of energy per hour

during a fixed working period, there is no flexibility to adjust their energy con-

sumption in response to the prices.

For interruptible appliances, the operations of these appliances can be inter-

rupted, i.e. it is possible to charge the PHEV for one hour, then stop charging

for one or several hours and then complete the charging after that.

For non-interruptible appliances, the operations cannot be interrupted, i.e. once such an appliance starts, it has to keep running until completion.

For curtailable appliances, the total energy consumption can be adjusted. For

example, if a customer feels the price in a given hour is too high, he can reduce

the usage of these appliances or even stop them.

Both the interruptible appliances and the non-interruptible appliances can

be seen as shiftable appliances whose operation periods could be postponed or

shifted from a high price period to a low price period. However, the curtailable

appliances cannot be shifted or postponed. As a result, the proposed waiting time

cost model only applies to the interruptible and non-interruptible appliances.

In the following, we will firstly give the mathematical models for different

types of appliances. Furthermore, we propose a financial-incentive based waiting

cost model for interruptible and non-interruptible appliances.

For each customer $n \in \mathcal{N}$, we denote the set of all appliances in the household as $A_n$, non-shiftable appliances as $NS_n$, interruptible appliances as $I_n$, non-interruptible appliances as $NI_n$ and curtailable appliances as $C_n$.

Interruptible Appliances

For each interruptible appliance $a \in I_n$, a scheduling vector of energy consumption over the scheduling horizon $\mathcal{H} = \{1, 2, \ldots, H\}$ is defined as follows:

$$x_{n,a} = [x_{n,a}^{1}, \ldots, x_{n,a}^{h}, \ldots, x_{n,a}^{H}]$$

where $x_{n,a}^{h} \ge 0$ represents the $n$-th customer's electricity consumption of appliance $a$ at time $h$.

Furthermore, the scheduling window for each appliance $a$ can be set by each customer according to his/her preference and is defined as $\mathcal{H}_{n,a} \triangleq \{\alpha_{n,a}, \alpha_{n,a} + 1, \ldots, \beta_{n,a}\}$. Since the window $\mathcal{H}_{n,a}$ is consecutive, one only needs to specify the beginning scheduling time $\alpha_{n,a} \in \mathcal{H}$ and the end time $\beta_{n,a} \in \mathcal{H}$.

The model of the payment minimization problem for each interruptible appliance is given as follows:

$$\min J_{I_n(a)}(\alpha_{n,a} : \beta_{n,a}) = \min_{x_{n,a}^{h}} \sum_{h = \alpha_{n,a}}^{\beta_{n,a}} p_h \times x_{n,a}^{h} \tag{4.7}$$

s.t.

$$\sum_{h = \alpha_{n,a}}^{\beta_{n,a}} x_{n,a}^{h} = E_{n,a}, \tag{4.8}$$

$$\gamma_{n,a}^{\min} \le x_{n,a}^{h} \le \gamma_{n,a}^{\max}, \quad \forall h \in \mathcal{H}_{n,a}. \tag{4.9}$$

From the above optimization model, we can see that the energy consumption at some hours will be at the minimum energy consumption level/standby power level ($\gamma_{n,a}^{\min}$). These standby power levels represent the interruptions during the scheduling window. Rather than explicitly including the start/end time of each interruption in the optimization problem, which is very difficult, the proposed optimization model accounts for the interruptions from the energy consumption perspective.

Constraint (4.8) states that, for each appliance $a$, the total energy consumption needed to accomplish the operations within the scheduling window is fixed, and is denoted by $E_{n,a}$. Constraint (4.9) states that there is a minimum power level and a maximum power level for each appliance $a$ within the scheduling window.
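Problem (4.7) to (4.9) is a linear program with one equality constraint and box bounds, so it can be solved by a simple greedy rule: set every hour to the minimum level and then fill the cheapest hours up to the maximum level until the required energy is reached. The Python sketch below illustrates this. It is not necessarily the solution method used in this thesis, and the prices and appliance parameters are hypothetical.

```python
def schedule_interruptible(prices, energy, g_min, g_max):
    """Greedy solution of (4.7)-(4.9) for one appliance.

    prices[i] is p_h for the i-th hour of the scheduling window,
    energy is E_{n,a}, and g_min/g_max are the hourly power bounds.
    """
    n = len(prices)
    if not (n * g_min <= energy <= n * g_max):
        raise ValueError("energy requirement is infeasible for this window")
    schedule = [g_min] * n                   # start every hour at standby level
    remaining = energy - n * g_min
    # Fill the cheapest hours first, each up to its maximum power level.
    for h in sorted(range(n), key=lambda i: prices[i]):
        extra = min(g_max - g_min, remaining)
        schedule[h] += extra
        remaining -= extra
        if remaining <= 0:
            break
    return schedule


# Hypothetical 4-hour window: prices in pence/kWh, 5 kWh needed, 0 to 2 kWh per hour.
prices = [10.0, 8.0, 12.0, 9.0]
plan = schedule_interruptible(prices, energy=5.0, g_min=0.0, g_max=2.0)
print(plan, sum(p * x for p, x in zip(prices, plan)))
```

In this example the most expensive hour is left at the minimum level, which corresponds to an interruption in the sense described above.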

Non-interruptible Appliances

As the operations of each non-interruptible appliance $a \in NI_n$ are consecutive, we define the length of the operations as $L_{n,a}$. The customers can set the scheduling window $\mathcal{H}_{n,a} \triangleq \{\alpha_{n,a}, \alpha_{n,a} + 1, \ldots, \beta_{n,a}\}$ by specifying the beginning scheduling time and the end time.

The optimization problem is to find each appliance's optimal start time $s_{n,a}^{*}$ that minimizes the customer's payment.

As a result, the model of the payment minimization problem for a non-interruptible appliance is given as follows:

$$\min J_{NI_n(a)}(\alpha_{n,a} : \beta_{n,a}) = \min_{s_{n,a},\, x_{n,a}^{h}} \sum_{h = s_{n,a}}^{s_{n,a} + L_{n,a}} p_h \times x_{n,a}^{h} \tag{4.10}$$

s.t.

$$s_{n,a} \in \{\alpha_{n,a}, \alpha_{n,a} + 1, \ldots, \beta_{n,a} - L_{n,a}\}, \tag{4.11}$$

$$\sum_{h = s_{n,a}}^{s_{n,a} + L_{n,a}} x_{n,a}^{h} = E_{n,a}, \tag{4.12}$$

$$\gamma_{n,a}^{\min} \le x_{n,a}^{h} \le \gamma_{n,a}^{\max}, \quad \forall h \in \mathcal{H}_{n,a}. \tag{4.13}$$

Constraint (4.11) indicates that the start time is a discrete variable belonging to the set $\{\alpha_{n,a}, \alpha_{n,a} + 1, \ldots, \beta_{n,a} - L_{n,a}\}$. Constraint (4.12) states that the total energy consumption needed to accomplish the consecutive operations is fixed at $E_{n,a}$. Constraint (4.13) shows that there is a minimum power level and a maximum power level for each appliance $a$ within the scheduling window.
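Since the start time in (4.11) takes only a few discrete values, problem (4.10) to (4.13) can be solved by enumerating every admissible start time and solving the remaining energy allocation for each one. The Python sketch below does this with a greedy inner step. It is an illustration under our own assumptions, not necessarily the algorithm used in this thesis: hours are plain list indices (no wrap-around at midnight), the operation occupies hours $s$ to $s + L$ exactly as the sums in (4.10) and (4.12) are written, and all numerical values are hypothetical.

```python
def cheapest_fill(prices, energy, g_min, g_max):
    """Cheapest allocation of `energy` over the given hours within [g_min, g_max]."""
    n = len(prices)
    if not (n * g_min <= energy <= n * g_max):
        return None                          # this start time is infeasible
    x = [g_min] * n
    remaining = energy - n * g_min
    for h in sorted(range(n), key=lambda i: prices[i]):
        extra = min(g_max - g_min, remaining)
        x[h] += extra
        remaining -= extra
    return sum(p * v for p, v in zip(prices, x)), x


def schedule_non_interruptible(prices, alpha, beta, length, energy, g_min, g_max):
    """Enumerate start times per (4.11) and return (start, cost, schedule)."""
    best = None
    for s in range(alpha, beta - length + 1):
        hours = range(s, s + length + 1)     # h = s, ..., s + L as in (4.10)
        result = cheapest_fill([prices[h] for h in hours], energy, g_min, g_max)
        if result is not None and (best is None or result[0] < best[1]):
            best = (s, result[0], result[1])
    return best


# Hypothetical hourly prices (pence/kWh) for hours 0..5.
prices = [12.0, 10.0, 8.0, 8.0, 11.0, 14.0]
start, cost, plan = schedule_non_interruptible(
    prices, alpha=0, beta=5, length=1, energy=2.0, g_min=0.5, g_max=1.5)
print(start, cost, plan)
```

The enumeration costs one inner allocation per admissible start time, which stays small because the scheduling window contains at most $H$ hours.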

Waiting Time Cost Model for Interruptible and Non-interruptible Appliances

We propose a financial-incentive based waiting cost model that is straightforward and easy to use for interruptible and non-interruptible appliances. In real applications, the customers first need to set the financial thresholds that trigger different waiting lengths. For example, customers can input their financial thresholds via the interface between a laptop or mobile phone and the home energy management software integrated in a smart meter. Additionally, such home energy management software can help customers determine their financial thresholds by providing a questionnaire. With the relevant information available, the customers can easily set the financial thresholds themselves. Secondly, the proposed waiting cost model determines the optimal waiting length for each appliance. Example 1 is used to illustrate the proposed waiting time scheme.

Example 1. We assume that the original scheduling window for PHEV is

[7PM - 11PM] and the maximum waiting time length is 3 hours. Furthermore,

the financial thresholds to trigger the waiting are 10 pence for 1 hour, 25 pence

for 2 hours and 45 pence for 3 hours. Assume that the energy bills saved by

different waiting hours are given in the table below:

Table 4.1: Energy bills saved by different waiting lengths

Waiting Length   Scheduling Window   Financial Threshold   Saved Bill
0 hours          [7PM - 11PM]        -                     -
1 hour           [7PM - 12AM]        10 pence              12 pence
2 hours          [7PM - 1AM]         25 pence              30 pence
3 hours          [7PM - 2AM]         45 pence              40 pence

From the above table, we can see that waiting for 1 hour saves 12 pence on the bill, which is higher than the financial threshold (10 pence). As a result, the waiting time cost model treats the current waiting length (1 hour) as a potential solution and then checks the next waiting length (2 hours). For the same reason, the waiting length of 2 hours is also treated as a potential solution, since the saving (30 pence) is higher than the threshold (25 pence). However, waiting for 3 hours saves the customer only 40 pence on the bill, which is lower than the financial threshold (45 pence). As a result, the waiting length of 3 hours cannot be regarded as a potential solution. The above process is repeated until all the waiting lengths are checked. The optimal waiting time is the potential solution that achieves the maximal bill saving for the customer. In the above example, the optimal waiting time for the PHEV is 2 hours.
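The selection procedure of Example 1 can be sketched in a few lines of Python. This is only a minimal illustration: the function name `optimal_waiting_length` and the dictionary layout are assumptions made here, and the saved bills are taken as given from Table 4.1 rather than computed from a price signal.

```python
def optimal_waiting_length(savings, thresholds):
    """Pick the waiting length with the largest saving among those whose
    saving meets the customer's financial threshold.

    savings[k] and thresholds[k] are in pence, keyed by waiting hours k >= 1.
    Returns (0, 0) when no waiting length qualifies (no waiting).
    """
    best_k, best_saving = 0, 0
    for k in sorted(savings):
        # A waiting length is a potential solution only if it pays off enough.
        if savings[k] >= thresholds[k] and savings[k] > best_saving:
            best_k, best_saving = k, savings[k]
    return best_k, best_saving


# Values from Table 4.1 (PHEV, maximum waiting time of 3 hours).
thresholds = {1: 10, 2: 25, 3: 45}
savings = {1: 12, 2: 30, 3: 40}
best_k, best_saving = optimal_waiting_length(savings, thresholds)
```

With the values of Table 4.1 the sketch selects a waiting length of 2 hours, in line with the example.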

Based on the above analysis and illustration, the mathematical representations

of the waiting time cost model are given below. To avoid repetition, we only deal

with the waiting time cost model for interruptible appliances. However, the model

also applies to non-interruptible appliances.

For each interruptible appliance $a \in I_n$, we assume that the maximum waiting time is denoted as $K_{n,a} \ge 1$, which can be set by customers in advance according to their preferences.

Without any waiting time, the minimized energy bill model for each interruptible appliance is denoted as follows:

\[
\min J_{I_n(a)}(\alpha_{n,a} : \beta_{n,a}) = \min_{x^h_{n,a}} \sum_{h=\alpha_{n,a}}^{\beta_{n,a}} p_h \times x^h_{n,a} \tag{4.14}
\]

With a waiting time of $k_{n,a} \in \{0, \ldots, K_{n,a}\}$, the minimized energy bill model is defined as:

\[
\min J_{I_n(a)}(\alpha_{n,a} : \beta_{n,a} + k_{n,a}) = \min_{x^h_{n,a}} \sum_{h=\alpha_{n,a}}^{\beta_{n,a}+k_{n,a}} p_h \times x^h_{n,a}, \tag{4.15}
\]

where $H_{n,a} = \{\alpha_{n,a}, \alpha_{n,a}+1, \ldots, \beta_{n,a}\}$ is extended to $\{\alpha_{n,a}, \alpha_{n,a}+1, \ldots, \beta_{n,a}+k_{n,a}\}$.

We define the Waiting Time Benefit Function, i.e. the energy bill saved by waiting $k_{n,a}$ hours, as follows:

\[
\Delta J_{I_n(a)}(k_{n,a}) = \min J_{I_n(a)}(\alpha_{n,a} : \beta_{n,a}) - \min J_{I_n(a)}(\alpha_{n,a} : \beta_{n,a} + k_{n,a}), \quad k_{n,a} = 0, 1, \ldots, K_{n,a} \tag{4.16}
\]


Furthermore, we define the Benefit Threshold Function as follows:

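\[
\Delta J_{I_n(a)} =
\begin{cases}
\Delta J_{I_n(a)}(1) & \text{if } \Delta J_{I_n(a)}(1) \ge C^{1}_{n,a} \\
\Delta J_{I_n(a)}(2) & \text{if } \Delta J_{I_n(a)}(2) \ge C^{2}_{n,a} \\
\quad\vdots & \quad\vdots \\
\Delta J_{I_n(a)}(K_{n,a}) & \text{if } \Delta J_{I_n(a)}(K_{n,a}) \ge C^{K}_{n,a} \\
0 & \text{if none of the above is satisfied}
\end{cases}
\tag{4.17}
\]

where the financial thresholds $C^{1}_{n,a}, C^{2}_{n,a}, \ldots, C^{K}_{n,a}$ are set by the customers as described above.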

As a result, the optimal waiting time of interruptible appliance $a$ for customer $n$ can be obtained by solving the following optimization problem:

\[
\max_{k_{n,a}} \; \Delta J_{I_n(a)} \tag{4.18}
\]

Similarly, the waiting time cost model for non-interruptible appliances can be represented as the following optimization problem:

\[
\max_{k_{n,a}} \; \Delta J_{NI_n(a)} \tag{4.19}
\]

Curtailable Appliances

Similarly to the interruptible and non-interruptible appliances, the customers can set the valid scheduling window $H_{n,a} \triangleq \{\alpha_{n,a}, \alpha_{n,a}+1, \ldots, \beta_{n,a}\}$. However, compared with interruptible and non-interruptible appliances, the scheduling window of curtailable appliances should be set more strictly and accurately, because the appliances will be 'on' for the whole window.

Some customers, in particular the more price sensitive ones, prefer to reduce their spending as much as possible subject to an acceptable comfort level (for example, the air conditioning is turned on for at least half an hour within each hour between 7:00pm and 10:00pm). The other, more numerous, customers prefer a budget based comfort maximization model in the sense that, for a given curtailable appliance such as air conditioning, they set up a daily budget (i.e. the maximum allowed daily spending) under which they schedule the energy consumption to make their life as comfortable as possible.


To meet the preferences of the above two types of customers, two types of optimization models for curtailable appliances are proposed below, and a customer can choose one of them depending on his/her preference.

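Minimize Bill Subject to an Acceptable Energy Consumption. This optimization scheme targets price sensitive customers. The proposed optimization model is given as follows:

\[
\min J^{1}_{C_n(a)}(\alpha_{n,a} : \beta_{n,a}) = \min_{x^h_{n,a}} \sum_{h=\alpha_{n,a}}^{\beta_{n,a}} p_h \times x^h_{n,a} \tag{4.20}
\]

s.t.

\[
\underline{u}^h_{n,a} \le x^h_{n,a} \le \overline{u}^h_{n,a}, \tag{4.21}
\]

\[
\sum_{h=\alpha_{n,a}}^{\beta_{n,a}} x^h_{n,a} \ge U^{min}_{n,a}. \tag{4.22}
\]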

Constraint (4.21) shows that the energy consumption at each hour lies between the minimum acceptable consumption level $\underline{u}^h_{n,a}$ and the maximum affordable consumption level $\overline{u}^h_{n,a}$, which can be set according to each individual customer's preferences. Constraint (4.22) indicates that the electricity consumed during the operation period should not be less than a minimum acceptable consumption level $U^{min}_{n,a}$, i.e. there exists an energy consumption constraint for each curtailable appliance.

Maximize Energy Consumption Subject to an Affordable Financial Constraint. This optimization scheme aims at the less price sensitive customers who prefer a budget based energy consumption maximization model. The proposed optimization model is given as follows:

\[
\max J^{2}_{C_n(a)}(\alpha_{n,a} : \beta_{n,a}) = \max_{x^h_{n,a}} \sum_{h=\alpha_{n,a}}^{\beta_{n,a}} x^h_{n,a} \tag{4.23}
\]

s.t.

\[
\underline{u}^h_{n,a} \le x^h_{n,a} \le \overline{u}^h_{n,a}, \tag{4.24}
\]

\[
\sum_{h=\alpha_{n,a}}^{\beta_{n,a}} p_h \times x^h_{n,a} \le C^{max}_{n,a}. \tag{4.25}
\]


Constraint (4.24) is the same as constraint (4.21). Constraint (4.25) indicates

that for each curtailable appliance, the money spent during the operation period

should not exceed the given budget, i.e. there exists a financial cap for each

curtailable appliance.

Since there are two types of optimization models for curtailable appliances,

the optimization problem for customer n including all the above considered types

of appliances has two different optimization objectives shown as Eqs.(4.26) and

(4.27). The customers can choose one of them depending on their preferences.

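\[
\min J^{1}_{n} = \min \sum_{a \in I_n} \left( J_{I_n(a)} - \Delta J_{I_n(a)} \right) + \sum_{a \in NI_n} \left( J_{NI_n(a)} - \Delta J_{NI_n(a)} \right) + \sum_{a \in C_n} J^{1}_{C_n(a)} \tag{4.26}
\]

subject to constraints (4.8)–(4.9), (4.11)–(4.13), and (4.21)–(4.22).

\[
\min J^{2}_{n} = \min \sum_{a \in I_n} \left( J_{I_n(a)} - \Delta J_{I_n(a)} \right) + \sum_{a \in NI_n} \left( J_{NI_n(a)} - \Delta J_{NI_n(a)} \right) - \sum_{a \in C_n} J^{2}_{C_n(a)} \tag{4.27}
\]

subject to constraints (4.8)–(4.9), (4.11)–(4.13), and (4.24)–(4.25).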

4.4.2 Retailer-side Problem at the Upper Level

In this subsection, we model the profit of the retailer as its revenue minus the energy cost imposed on the retailer.

We define a cost function Ch(Lh) indicating the cost of the retailer providing

electricity at each hour h ∈ H, where Lh represents the amount of power provided

to all customers at each hour of the day. We assume that the cost function Ch(Lh)

is convex increasing in Lh for each h [MRWJ+10b] [LCL11]. In view of this, the

cost function is designed as follows [MRWJ+10b].

\[
C_h(L_h) = a_h L_h^2 + b_h L_h + c_h \tag{4.28}
\]

where $a_h > 0$, $b_h \ge 0$ and $c_h \ge 0$ at each hour $h \in H$.

As a result, the profit maximization model is given as follows:

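\[
\max_{p_h} \; \sum_{h \in H} p_h \times \sum_{n \in N} \sum_{a \in A_n} x^h_{n,a} - \sum_{h \in H} C_h\Big( \sum_{n \in N} \sum_{a \in A_n} x^h_{n,a} \Big) \tag{4.29}
\]

s.t.

\[
p^{min}_h \le p_h \le p^{max}_h, \tag{4.30}
\]

\[
\sum_{n \in N} \sum_{a \in A_n} x^h_{n,a} \le E^{max}_h, \quad \forall h \in H, \tag{4.31}
\]

\[
\sum_{h \in H} p_h \times \sum_{n \in N} \sum_{a \in A_n} x^h_{n,a} \le R^{max}. \tag{4.32}
\]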

Constraint (4.30) represents that the prices the retailer can offer are greater

than a minimum price, for example, the wholesale price at each hour, and less

than a maximum price, for example, the price cap of the retail price due to retail

market competition and regulation. Constraint (4.31) indicates that there usually

exists a maximum supply capacity by the retailer or a maximum load capacity

of power networks. Because energy use is inelastic, we add the revenue constraint (4.32) to improve the acceptability of the retailer's pricing strategies, i.e. there exists a total revenue cap, denoted as $R^{max}$, for the retailer. Without such a constraint, the retail prices would keep rising to a level that is politically unacceptable to the government, political parties, and energy regulators, and financially unacceptable to customers.
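As an illustration of Eqs.(4.28)–(4.32), the sketch below evaluates the retailer's revenue, cost and profit for a given price vector and a given aggregated hourly load, and checks the three constraints. The function names and the two-hour example data are assumptions introduced here for illustration; in the model itself the loads are the customers' optimal responses to the prices rather than fixed inputs.

```python
def retailer_profit(prices, loads, a, b, c):
    """Return (revenue, cost, profit) for hourly prices and aggregated loads.

    prices[h] is p_h, loads[h] is the total load L_h of all customers,
    and a[h], b[h], c[h] are the coefficients of the cost function (4.28).
    """
    revenue = sum(p * L for p, L in zip(prices, loads))
    cost = sum(ah * L * L + bh * L + ch
               for ah, bh, ch, L in zip(a, b, c, loads))
    return revenue, cost, revenue - cost


def is_feasible(prices, loads, p_min, p_max, e_max, r_max):
    """Check the price bounds (4.30), capacity (4.31) and revenue cap (4.32)."""
    within_price_bounds = all(lo <= p <= hi
                              for p, lo, hi in zip(prices, p_min, p_max))
    within_capacity = all(L <= cap for L, cap in zip(loads, e_max))
    revenue = sum(p * L for p, L in zip(prices, loads))
    return within_price_bounds and within_capacity and revenue <= r_max


# Hypothetical two-hour example (prices in pence/kWh, loads in kWh).
prices = [10.0, 12.0]
loads = [100.0, 50.0]
revenue, cost, profit = retailer_profit(
    prices, loads, a=[5.5e-4, 5.5e-4], b=[0.0, 0.0], c=[0.0, 0.0])
feasible = is_feasible(prices, loads, p_min=[8.0, 8.0], p_max=[14.0, 14.0],
                       e_max=[150.0, 150.0], r_max=2000.0)
```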

4.5 Bilevel Model Solutions

In this thesis, we propose a hybrid optimization approach to solve the bilevel

optimization problem. Our approach determines the energy prices by interacting

with the customers (smart meters) within the framework of genetic algorithms

for the upper level problem and individual optimization algorithm for the lower

level problem.

In this section we will firstly prove the existence of the optimal solution to our

bi-level model, secondly show the solution algorithm to the lower-level problem,

and finally present the solution algorithm to the upper-level problem.


4.5.1 Existence of Optimal Solutions to the Bilevel Model

First, we consider the following bilevel model with one upper level agent and one

lower level agent.

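\[
\begin{aligned}
\max_{x, y_1, \ldots, y_N} \quad & F(x, y_1, \ldots, y_N) \\
\text{subject to} \quad & (y_1, \ldots, y_N) \in \arg\min_{y_1, \ldots, y_N} \Big\{ \sum_{i=1}^{N} f_i(x, y_i) : g_i(x, y_i) \le 0,\ i = 1, \ldots, N \Big\}, \\
& G(x, y_1, \ldots, y_N) \le 0, \\
& x \in X,\ (y_1, \ldots, y_N) \in Y_1 \times \cdots \times Y_N
\end{aligned}
\tag{4.33}
\]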

Note that each fi(x, yi) (i = 1, ..., N) in the objective function of the above

lower level problem is independent from each other. Each constraint function

gi(x, yi) (i = 1, ..., N) of the lower level problem is also independent from each

other. Further it is assumed that the above considered bilevel optimization prob-

lem has at least one feasible solution.

Lemma 2. The bilevel model with one upper level agent and N independent lower

level agents (Eq.(4.6)) is equivalent to the bilevel model with one upper level agent

and one lower level agent (Eq.(4.33)).

Proof. Firstly, for any fixed $x$, if

\[
y_i^* \in \arg\min_{y_i \in Y_i} \{ f_i(x, y_i) : g_i(x, y_i) \le 0 \}, \quad i = 1, 2, \ldots, N,
\]

that is, for all $i = 1, 2, \ldots, N$,

\[
f_i(x, y_i^*) \le f_i(x, y_i) \quad \forall y_i \in Y_i,\ g_i(x, y_i) \le 0,
\]

which implies immediately

\[
\sum_{i=1}^{N} f_i(x, y_i^*) \le \sum_{i=1}^{N} f_i(x, y_i) \quad \forall y_i \in Y_i,\ g_i(x, y_i) \le 0,\ i = 1, 2, \ldots, N.
\]

That is,

\[
(y_1^*, \ldots, y_N^*) \in \arg\min_{(y_1, \ldots, y_N) \in Y_1 \times \cdots \times Y_N} \Big\{ \sum_{i=1}^{N} f_i(x, y_i) : g_i(x, y_i) \le 0,\ i = 1, \ldots, N \Big\}.
\]

On the other hand, if the above relation holds, that is,

\[
\sum_{i=1}^{N} f_i(x, y_i^*) \le \sum_{i=1}^{N} f_i(x, y_i) \quad \forall y_i \in Y_i,\ g_i(x, y_i) \le 0,\ i = 1, 2, \ldots, N,
\]

then for any $i_0 \in \{1, 2, \ldots, N\}$, by choosing $y_i = y_i^*$ for $i = 1, \ldots, i_0 - 1, i_0 + 1, \ldots, N$, the above inequality implies

\[
f_{i_0}(x, y_{i_0}^*) \le f_{i_0}(x, y_{i_0}) \quad \forall y_{i_0} \in Y_{i_0},\ g_{i_0}(x, y_{i_0}) \le 0.
\]

That is,

\[
y_{i_0}^* \in \arg\min_{y_{i_0} \in Y_{i_0}} \{ f_{i_0}(x, y_{i_0}) : g_{i_0}(x, y_{i_0}) \le 0 \}, \quad i_0 = 1, 2, \ldots, N.
\]

The above proof shows that

\[
y_i^* \in \arg\min_{y_i \in Y_i} \{ f_i(x, y_i) : g_i(x, y_i) \le 0 \}, \quad i = 1, 2, \ldots, N,
\]

if and only if

\[
(y_1^*, \ldots, y_N^*) \in \arg\min_{(y_1, \ldots, y_N) \in Y_1 \times \cdots \times Y_N} \Big\{ \sum_{i=1}^{N} f_i(x, y_i) : g_i(x, y_i) \le 0,\ i = 1, \ldots, N \Big\}.
\]

Based on the formulations of Eq.(4.6) and Eq.(4.33), this implies that the two bilevel optimization problems have exactly the same objective functions and constraints; therefore they are equivalent and have the same optimal solutions.

Lemma 3. Consider the bilevel model shown as Eq.(4.33), if X is a finite space,

then the optimal solutions to the bilevel model exist.

Proof. Firstly, for a given $x \in X$, denote

\[
\Omega(x) = \arg\min_{(y_1, \ldots, y_N) \in Y_1 \times \cdots \times Y_N} \Big\{ \sum_{i=1}^{N} f_i(x, y_i) : g_i(x, y_i) \le 0,\ i = 1, \ldots, N \Big\}, \tag{4.34}
\]

and choose

\[
(y_1^*, \ldots, y_N^*) = \arg\max_{(y_1, \ldots, y_N) \in \Omega(x)} \{ F(x, y_1, \ldots, y_N) : G(x, y_1, \ldots, y_N) \le 0 \}. \tag{4.35}
\]

Now denote $(y_1^*, \ldots, y_N^*) = R(x)$. As $X$ is a finite set, there exists an optimal solution as follows:

\[
x^* = \arg\max_{x \in X} \{ F[x, R(x)] : G[x, R(x)] \le 0 \}. \tag{4.36}
\]

Now we are going to prove [x∗, R(x∗)] is the optimal solution to the bilevel

problem (4.33).

For any feasible solution (x, y1, ..., yN) to the bilevel problem (4.33), from

(4.34), (4.35), (4.36), we have

\[
F(x, y_1, \ldots, y_N) \le F[x, R(x)] \le F[x^*, R(x^*)]. \tag{4.37}
\]

As [x∗, R(x∗)] is a feasible solution to (4.33) based on (4.34), (4.35), (4.36),

the inequality (4.37) implies that the objective function of the bilevel problem

(4.33) takes its maximal value at [x∗, R(x∗)]. Therefore, [x∗, R(x∗)] is the optimal

solution to (4.33).
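The construction used in this proof can be mirrored by a brute-force search, sketched below for a toy problem. The sketch assumes a single lower-level agent and, unlike Lemma 3, also a finite lower-level set Y so that Ω(x) can be enumerated. The function name and the toy objective and constraint functions are hypothetical and are not part of the thesis model.

```python
def solve_bilevel_by_enumeration(X, Y, F, f, g, G):
    """Enumerate a finite X following (4.34)-(4.36).

    For each x: build Omega(x) (lower-level minimizers), pick R(x) as the
    member of Omega(x) that satisfies G <= 0 and maximizes F, then keep
    the x with the largest F[x, R(x)].
    Returns (x*, R(x*), F value) or None if nothing is feasible.
    """
    best = None
    for x in X:
        feasible = [y for y in Y if g(x, y) <= 0]
        if not feasible:
            continue
        lowest = min(f(x, y) for y in feasible)
        omega = [y for y in feasible if f(x, y) == lowest]   # Eq. (4.34)
        candidates = [y for y in omega if G(x, y) <= 0]
        if not candidates:
            continue
        y_star = max(candidates, key=lambda y: F(x, y))      # Eq. (4.35)
        value = F(x, y_star)
        if best is None or value > best[2]:                  # Eq. (4.36)
            best = (x, y_star, value)
    return best


# Toy instance: x is a price, y a consumption level (all values hypothetical).
X = [1, 2, 3]
Y = [0, 1, 2, 3]
F = lambda x, y: x * y - 0.5 * y          # upper level: revenue minus cost
f = lambda x, y: x * y + (3 - y) ** 2     # lower level: bill plus discomfort
g = lambda x, y: -y                       # lower level: y >= 0
G = lambda x, y: x * y - 5                # upper level: revenue cap of 5
result = solve_bilevel_by_enumeration(X, Y, F, f, g, G)
```

In this toy instance the lower-level problem has ties for x = 1 and x = 3, which is where the tie-breaking rule (4.35) matters.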

Theorem 3. Consider the bilevel model with one upper-level decision agent (re-

tailer) shown as Eqs.(4.29 - 4.32) and N independent lower-level decision agents

(customers) shown as Eq.(4.26) or Eq.(4.27). Then an optimal solution to the

bilevel model exists.

Proof. Firstly, according to Lemma 2, our considered bilevel pricing optimization problem given in (4.26), (4.27) and (4.29)-(4.32) with one upper-level decision agent (retailer) and N independent lower-level decision agents (customers) is equivalent to the bilevel model with one upper-level decision agent and one lower-level decision agent. Therefore, we only need to prove the existence of the optimal solution for an optimization problem of the form (4.33).

Each decision variable $p_h$ ($h = 1, \ldots, 24$) in the decision variable vector $P = (p_1, \ldots, p_{24})$ of the upper level problem can take only finitely many values (in practice, prices are often quoted to one decimal place of the smallest unit of a given currency), i.e. the price at each hour $h$ can only take $D_h$ values, where $D_h$ is the number of possible price values within the interval $[p^{min}_h, p^{max}_h]$. As a result, $D = \prod_{h=1}^{24} D_h$ is a finite integer.

We denote the space of all prices across 24 hours (i.e. P ) at the upper-level

problem as UP . Noting the total number of elements of set UP is D, which is

a finite integer, it is implied immediately that UP is a finite space. Therefore,

based on Lemma 3, it is implied that the optimal solution to the considered bilevel

pricing problem exists.
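To get a feel for the size of this finite price space, consider an illustrative assumption that is not made in the proof itself: prices are quoted in steps of 0.1 pence within the bounds of 8.0 to 14.0 pence used later in Section 4.6.2. Then

\[
D_h = \frac{14.0 - 8.0}{0.1} + 1 = 61, \qquad D = \prod_{h=1}^{24} D_h = 61^{24} \approx 7 \times 10^{42}.
\]

Under this assumption the price space is finite, as Lemma 3 requires, but far too large to enumerate, which is consistent with the use of a heuristic search for the upper-level problem.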

4.5.2 Solutions to the Lower-level Problem

As the lower-level optimization problem is the sum of three separable sub-optimization

problems corresponding to interruptible, non-interruptible, and curtailable appli-

ances respectively, the lower-level problem can be solved by solving each sub-

optimization problem separately.

Interruptible Appliances

The mathematical model of interruptible appliances, shown as Eqs.(4.7 - 4.9), is a typical linear programming problem and can be solved using standard optimization software.

Non-interruptible Appliances

We first define the sub-problem of the original model Eqs.(4.10 - 4.13) for non-interruptible appliances by fixing the start time at $s'_{n,a} \in [\alpha_{n,a}, \beta_{n,a} - L_{n,a}]$:

\[
\begin{aligned}
\min_{x^h_{n,a}} \quad & \sum_{h=s'_{n,a}}^{s'_{n,a}+L_{n,a}} p_h \times x^h_{n,a} \\
\text{s.t.} \quad & \sum_{h=s'_{n,a}}^{s'_{n,a}+L_{n,a}} x^h_{n,a} = E_{n,a}, \\
& \gamma^{min}_{n,a} \le x^h_{n,a} \le \gamma^{max}_{n,a}, \quad \forall h \in [s'_{n,a}, s'_{n,a} + L_{n,a}].
\end{aligned}
\tag{4.38}
\]

Eq.(4.38) is a linear programming problem and can be solved by standard optimization software. As a result, the original problem Eqs.(4.10 - 4.13) can be solved in an iterative manner, i.e. by solving Eq.(4.38) for every admissible start time and keeping the start time with the lowest payment.
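The iteration over start times can be sketched as follows. Instead of calling an LP solver, the sketch solves each fixed-start sub-problem with a greedy fill of the cheapest hours, which gives an optimal solution for this particular structure (one equality constraint on the total energy plus identical box bounds). This substitution, the function names and the example prices are assumptions made for illustration. The same `cheapest_fill` routine would also cover the interruptible case when applied to the whole scheduling window.

```python
def cheapest_fill(prices, energy, lo, hi):
    """Solve min sum(p*x) s.t. sum(x) = energy, lo <= x <= hi (greedy).

    Returns (cost, schedule), or None if the energy target cannot be met.
    """
    n = len(prices)
    remaining = energy - n * lo
    if remaining < -1e-9 or remaining > n * (hi - lo) + 1e-9:
        return None
    x = [lo] * n
    # Fill the cheapest hours first, up to the maximum power level.
    for i in sorted(range(n), key=lambda i: prices[i]):
        add = min(hi - lo, max(remaining, 0.0))
        x[i] += add
        remaining -= add
    cost = sum(p * xi for p, xi in zip(prices, x))
    return cost, x


def schedule_non_interruptible(prices, alpha, beta, length, energy, lo, hi):
    """Try every start time s in {alpha, ..., beta - length}.

    Returns (best start, payment, schedule) or None if no start is feasible.
    """
    best = None
    for s in range(alpha, beta - length + 1):
        window = prices[s:s + length + 1]  # hours s..s+length, as in Eq. (4.38)
        result = cheapest_fill(window, energy, lo, hi)
        if result is not None and (best is None or result[0] < best[1]):
            best = (s, result[0], result[1])
    return best


# Hypothetical hourly prices for hours 0..4.
best = schedule_non_interruptible(
    prices=[5, 3, 4, 2, 6], alpha=0, beta=4, length=1, energy=3, lo=0, hi=2)
```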

Curtailable Appliances

The optimization problems Eqs.(4.20 - 4.22) and Eqs.(4.23 - 4.25) for curtailable appliances are linear programming problems and can be solved by standard optimization software.
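For intuition, both curtailable models can also be solved without an LP solver when all prices are positive: start every hour at its minimum acceptable level and then add consumption in the cheapest hours first. The sketch below illustrates this under that assumption. The function names and example values are hypothetical, and the greedy rule is used here in place of the optimization software mentioned above.

```python
def min_bill(prices, u_lo, u_hi, u_min_total):
    """Eqs. (4.20)-(4.22): cheapest schedule with total >= u_min_total.

    Returns the hourly schedule, or None if the total cannot be reached.
    """
    x = list(u_lo)
    shortfall = u_min_total - sum(x)
    for h in sorted(range(len(prices)), key=lambda h: prices[h]):
        if shortfall <= 0:
            break
        add = min(u_hi[h] - u_lo[h], shortfall)
        x[h] += add
        shortfall -= add
    return None if shortfall > 1e-9 else x


def max_consumption(prices, u_lo, u_hi, budget):
    """Eqs. (4.23)-(4.25): largest total consumption with bill <= budget.

    Returns the hourly schedule, or None if even the minimum levels
    exceed the budget. Assumes strictly positive prices.
    """
    x = list(u_lo)
    money = budget - sum(p * xi for p, xi in zip(prices, x))
    if money < -1e-9:
        return None
    for h in sorted(range(len(prices)), key=lambda h: prices[h]):
        if money <= 0:
            break
        add = min(u_hi[h] - u_lo[h], money / prices[h])
        x[h] += add
        money -= add * prices[h]
    return x


# Hypothetical three-hour window (prices in pence/kWh, levels in kWh).
prices = [10.0, 8.0, 12.0]
u_lo, u_hi = [0.5, 0.5, 0.5], [2.0, 2.0, 2.0]
schedule_min_bill = min_bill(prices, u_lo, u_hi, u_min_total=3.0)
schedule_max_use = max_consumption(prices, u_lo, u_hi, budget=40.0)
```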

4.5.3 Distributed Optimization Algorithms to the Upper

Level Problem

The starting time and waiting time in the lower-level problem make the lower-level problem non-differentiable and discontinuous. In this subsection, we therefore adopt GA based distributed optimization algorithms to solve the profit maximization problem on the retailer's side and show how the retailer finds the optimal day-ahead electricity prices taking into account the customers' responses.

To avoid excessive data passing between the retailer and the smart meters and to reduce the number of generations for the GA (each generation requires a new group of prices to be sent to all customers/agents so that they can re-compute their optimal reactions, which is very costly), we propose two strategies that improve the algorithms' performance: 1) use a larger population for the GA. This strategy is based on the observation that the local optimization problems (on the customers' side) are simple and easy to compute, even with a very large population; 2) reduce the number of GA generations accordingly to improve the efficiency of the algorithm, as such a large population size can ensure the GA's convergence.

Instead of simply increasing the population size, we propose a multi-population GA method [MSB91] to tackle the problem, i.e. a single population is divided into multiple sub-populations and each sub-population evolves in the traditional GA way. In addition, individuals migrate from one sub-population to another from time to time, which is known as the island model [WRH99], and we use the ring migration topology, where individuals are transferred between directionally adjacent sub-populations [TMKH96]. As the multi-population GA outperforms the simple GA and converges faster [CHF03], we choose to use the multi-population GA instead of the simple GA in our research to reduce the number of generations and therefore the costly data passing between the retailer and its smart meters. In the GA setting for each sub-population, binary encoding and deterministic tournament selection without replacement are adopted. For the crossover and mutation operations, we employ uniform crossover and bit flip mutation respectively. The constraints of the upper level problem are handled by the approach proposed in [Deb00].

Algorithm 3 Multi-population GA based pricing algorithm to Eqs.(4.29 - 4.32) executed by the retailer

1: Population initialization, i.e. generating a population of N chromosomes randomly.
2: Produce C sub-populations, i.e. each sub-population has N/C individuals.
3: Each sub-population evolves in a traditional GA way shown as steps (4 – 9).
4: for i=1 to N/C do
5:    The utility company announces strategy i, i.e. it announces a set of 24-hour prices by decoding the ith chromosome to the smart meters (customers) via the two way communication infrastructure.
6:    Receive the optimal response of each customer n (smart meter) to strategy i including the optimal energy consumption information.
7:    Check the feasibility of strategy i to see if it satisfies all the constraints Eqs.(4.30 - 4.32). If not, handle the invalid individuals by the approach proposed in [Deb00]. Then, obtain the fitness value of strategy i.
8: end for
9: A new generation of chromosomes is created by using the selection, crossover and mutation operations.
10: Migrations between sub-populations.
11: Steps (3 - 10) are repeated until the stopping condition is reached.
12: The retailer announces the final price vector to the smart meters (customers) via LAN at the beginning of the scheduling horizon.

Algorithm 4 Energy management system executed by each smart meter (customer)

1: Receive the price information from the retailer.
2: The smart meter calculates the energy consumption in response to prices by solving the lower-level problem Eq.(4.26) or Eq.(4.27).
3: The smart meter sends back the total energy consumption at each hour to the retailer via the two way communication infrastructure.

Finally, the multi-population GA based distributed algorithms are shown in Algorithms 3 and 4, which are implemented on the retailer side and the customer side respectively. To implement the proposed multi-population GA, we modify the multi-population GA in [CF95] to make it applicable to our problem.

In the end, the most profitable prices for the retailer and the best usage patterns and schedules with the maximized benefits for each customer are found.
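The overall structure of an island-model GA with ring migration can be sketched as below. This is a simplified illustration rather than the implementation used in the thesis: `toy_profit` is a hypothetical stand-in for the exchange with the smart meters (steps 5 to 7 of Algorithm 3), the selection is a plain binary tournament, elitism is added so that the best individual is never lost, the constraint handling of [Deb00] is omitted, and all parameter values are chosen only to keep the example small.

```python
import random


def decode(bits, n_hours, bits_per_price, p_min, p_max):
    """Map a bit list to n_hours prices, each within [p_min, p_max]."""
    prices = []
    for h in range(n_hours):
        chunk = bits[h * bits_per_price:(h + 1) * bits_per_price]
        value = int("".join(map(str, chunk)), 2)
        prices.append(p_min + (p_max - p_min) * value / (2 ** bits_per_price - 1))
    return prices


def toy_profit(prices):
    """Hypothetical fitness: linear demand response and quadratic cost."""
    profit = 0.0
    for p in prices:
        load = max(0.0, 400.0 - 20.0 * p)
        profit += p * load - 5.5e-4 * load ** 2
    return profit


def evolve(pop, fitness, rng, p_mut):
    """One generation: tournament selection, uniform crossover, bit flip."""
    children = [max(pop, key=fitness)[:]]  # elitism (added assumption)
    while len(children) < len(pop):
        parents = [max(rng.sample(pop, 2), key=fitness) for _ in range(2)]
        child = [rng.choice(pair) for pair in zip(*parents)]
        child = [1 - b if rng.random() < p_mut else b for b in child]
        children.append(child)
    return children


def multi_population_ga(n_islands=4, island_size=20, generations=30,
                        migrate_every=5, n_hours=4, bits_per_price=10,
                        p_min=8.0, p_max=14.0, p_mut=0.01, seed=0):
    rng = random.Random(seed)
    n_bits = n_hours * bits_per_price

    def fitness(bits):
        return toy_profit(decode(bits, n_hours, bits_per_price, p_min, p_max))

    islands = [[[rng.randint(0, 1) for _ in range(n_bits)]
                for _ in range(island_size)] for _ in range(n_islands)]
    history = []
    for g in range(1, generations + 1):
        islands = [evolve(pop, fitness, rng, p_mut) for pop in islands]
        if g % migrate_every == 0:
            # Ring migration: island i receives a copy of the best individual
            # of island i-1, which replaces its own worst individual.
            bests = [max(pop, key=fitness)[:] for pop in islands]
            for i, pop in enumerate(islands):
                worst = min(range(len(pop)), key=lambda j: fitness(pop[j]))
                pop[worst] = bests[i - 1]
        history.append(max(fitness(ind) for pop in islands for ind in pop))
    best = max((ind for pop in islands for ind in pop), key=fitness)
    return decode(best, n_hours, bits_per_price, p_min, p_max), fitness(best), history
```

Calling `multi_population_ga()` returns the decoded best price vector, its fitness, and the best fitness per generation. In the distributed setting, each fitness evaluation would correspond to one round of price announcements and customer responses, which is why reducing the number of generations matters.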

4.5.4 Benefits of the Proposed Distributed Optimization

Algorithms

In this subsection, we discuss the benefits of our proposed distributed optimization framework and how it addresses the privacy and scaling problems.

On the one hand, our proposed distributed optimization algorithms can preserve customers' privacy, based on the assumption that there exists a third regulation party between the retailer and its customers. Each customer's anticipated energy consumption data is first uploaded to the third party and processed there. The third party passes only the aggregated consumption data of a whole region, or an abstract model, to the retailer for pricing optimization. That is, the retailer does not have direct access to each individual customer's data.

On the other hand, our proposed distributed optimization algorithms are promising for solving large-scale demand response problems according to the convergence analysis presented and discussed in Section 4.6. From the convergence analysis and Figure 4.4, we can see that, as the number of customers increases, the number of generations that the multi-population genetic algorithm based distributed algorithms need to converge does not increase much. Although we are not able to scale the simulations up to a higher number of customers (e.g., a million customers) due to computer hardware resource limitations, the existing results suggest that our proposed distributed algorithms are promising for solving large-scale demand response problems.

4.6 Numerical Results

We simulate a neighbourhood consisting of 100 customers served by one energy

retailer. It is assumed that each customer has 4 appliances: PHEV, dishwasher,

washing machine and air-conditioning. The scheduling horizon is set from 8AM

to 8AM (the next day). We consider heterogeneous customers, i.e. customers

are different in terms of energy usage and appliance settings. In the following,


we give the parameter settings for both the lower-level model and the upper-level

model.

Note that, in the following, $\alpha_{n,a}$ and $\beta_{n,a}$ are uniformly distributed integers for all appliance settings.

For PHEV, recall from Eqs.(4.7 – 4.9), $E_{n,a}$ is chosen from the uniform distribution on [9, 11] kWh. Each $\gamma^{min}_{n,a}$ is 0 kWh, and each $\gamma^{max}_{n,a}$ is chosen from the uniform distribution on [2.5, 3.3] kWh. $\alpha_{n,a}$ is chosen from the uniform distribution on [6, 9] PM, and $\beta_{n,a}$ is chosen from the uniform distribution on [5, 8] AM (the next day).

For dishwasher, $E_{n,a}$ is chosen from the uniform distribution on [2.3, 2.9] kWh. Each $\gamma^{min}_{n,a}$ is 0 kWh, and each $\gamma^{max}_{n,a}$ is chosen from the uniform distribution on [1.2, 1.7] kWh. $\alpha_{n,a}$ is chosen from the uniform distribution on [8, 11] AM, and $\beta_{n,a}$ is chosen from the uniform distribution on [6, 9] PM.

For washing machine, $E_{n,a}$ is chosen from the uniform distribution on [1.8, 2.3] kWh. Each $\gamma^{min}_{n,a}$ is 0 kWh, and each $\gamma^{max}_{n,a}$ is chosen from the uniform distribution on [1.0, 1.5] kWh. $\alpha_{n,a}$ is chosen from the uniform distribution on [6, 9] PM, and $\beta_{n,a}$ is chosen from the uniform distribution on [5, 8] AM (the next day).

For air-conditioning, for the purpose of simulation, we assume that all the customers choose the second optimization model, i.e. maximize energy consumption subject to an acceptable financial constraint. As a result, $\underline{u}^h_{n,a}$ is chosen from the uniform distribution on [0.5, 0.8] kWh, and $\overline{u}^h_{n,a}$ is chosen from the uniform distribution on [1.8, 2.2] kWh. $\alpha_{n,a}$ is chosen from the uniform distribution on [4, 6] PM, and $\beta_{n,a}$ is chosen from the uniform distribution on [9, 11] PM (the next day). $C^{max}_{n,a}$ is chosen from the uniform distribution on [70, 90] pence.

Furthermore, the upper bound of hourly energy consumption for each household $E^{max}_n$ is chosen from the uniform distribution on [3.5, 4.5] kWh.
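The settings above can be turned into a small sampler for heterogeneous customers, sketched below for the three shiftable appliances (air-conditioning is left out to keep the example short). The dictionary layout, the function name, the random seed and the encoding of next-day hours as 24 plus the clock hour are assumptions made for this illustration.

```python
import random

# Hours use a 24-hour clock; next-day hours are written as 24 + hour
# (an encoding chosen for this sketch, e.g. 5 AM the next day -> 29).
APPLIANCES = {
    "PHEV": {"energy": (9.0, 11.0), "gamma_max": (2.5, 3.3),
             "alpha": (18, 21), "beta": (29, 32)},
    "dishwasher": {"energy": (2.3, 2.9), "gamma_max": (1.2, 1.7),
                   "alpha": (8, 11), "beta": (18, 21)},
    "washing machine": {"energy": (1.8, 2.3), "gamma_max": (1.0, 1.5),
                        "alpha": (18, 21), "beta": (29, 32)},
}


def sample_customer(rng):
    """Draw one customer's appliance parameters from the stated ranges."""
    customer = {}
    for name, spec in APPLIANCES.items():
        customer[name] = {
            "E": rng.uniform(*spec["energy"]),             # kWh
            "gamma_min": 0.0,                              # kWh
            "gamma_max": rng.uniform(*spec["gamma_max"]),  # kWh
            "alpha": rng.randint(*spec["alpha"]),          # integer start hour
            "beta": rng.randint(*spec["beta"]),            # integer end hour
        }
    customer["E_max_hourly"] = rng.uniform(3.5, 4.5)       # kWh
    return customer


rng = random.Random(42)
customers = [sample_customer(rng) for _ in range(100)]
```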

The maximum waiting time for PHEV, dishwasher and washing machine is

set to 3 hours and the financial thresholds set by the customers are shown in

Table 4.2.

For the cost of the energy provided to customers by the utility company, we use the cost function shown as Eq.(4.28). We assume that $b_h = 0$, $c_h = 0$ for all $h \in H$. Also, we have $a_h = 5.5 \times 10^{-4}$ pence during the day, i.e. from 8AM to 12AM, and $a_h = 4.0 \times 10^{-4}$ pence at night hours, i.e. from 12AM to 8AM (the next day).

Table 4.2: Financial thresholds of waiting length

Waiting Length   PHEV       Dishwasher   Washing machine
1 hour           12 pence   10 pence     10 pence
2 hours          25 pence   20 pence     20 pence
3 hours          45 pence   35 pence     30 pence

In this section, firstly the convergence analysis of our proposed algorithms is given. Secondly, we show the benefits to the retailer of employing our proposed day-ahead pricing scheme, which is compared with a flat pricing scheme. Thirdly, we present the benefits to customers of adopting our proposed home energy management scheme.

4.6.1 Convergence Analysis

In this subsection, we first test the aforementioned two strategies (i.e. increasing the population size and reducing the number of generations) in terms of convergence for the GA. Then we test the convergence speed of the proposed distributed optimization algorithm with different numbers of customers, ranging from 100 to 1000.

We use the generation number to represent the convergence speed of the multi-

population GA/simple GA, which is widely used [LR93] [BN13] and easy to im-

plement and compare.

We compare the multi-population GA with the traditional/simple GA. The

results shown in Figure 4.3 indicate that under simple GA, the algorithm con-

verges at around the 150th generation. In contrast, the multi-population GA

only needs around 35 generations to converge, which can efficiently reduce the

data communication between the retailer and the smart meters.

Furthermore, we conduct simulations to show the convergence speed of our proposed multi-population GA and the simple GA under different customer numbers, as shown in Figure 4.4. It is worth mentioning that the convergence speed does not change much when the number of customers increases, which indicates that our proposed distributed optimization algorithm scales well with the number of customers. For example, when there are 1000 customers, the multi-population GA converges in only around 40 generations.


[Plot: fitness value (y-axis, 1 to 1.4 × 10^4) against generations (x-axis, 0 to 200); curves: Multi-Population GA, Simple GA]

Figure 4.3: Convergence speed of the multi-population GA and the simple GA

[Plot: convergence speed in generations (y-axis, 0 to 200) against customer number (x-axis, 100 to 1000); curves: Multi-population GA, Simple GA]

Figure 4.4: Convergence of the multi-population GA and the simple GA under different customer numbers


Table 4.3: Parameter settings of the multi-population GA

Parameter Name             Symbol   Values
Number of Sub-population   Sp       15
Sub-population Size        N        40
Migration Rate             Mr       0.2
Chromosome Length          L        10
Mutation Probability       Pm       0.01
Terminate Generation       T        100

4.6.2 Benefits to the Retailer

In this subsection, we compare our proposed optimal day-ahead pricing scheme

with optimal flat pricing scheme. The parameter settings of our proposed multi-

population GA are shown in Table 4.3.

Since the customers have no incentives to change their energy consumption

patterns when responding to flat pricing, we assume that, under the flat pricing,

the customers start the operations of appliances right at the beginning of the

scheduling window Ha and the appliances work at their typical power levels.

We assume that, for each hour h, 8.0 pence ≤ ph ≤ 14.0 pence holds. When

calculating the optimal flat pricing, we use the same parameters and model as

those of optimal day-ahead pricing.

The obtained optimal day-ahead prices and flat prices are given in Figure 4.5.

Finally, the details of revenue, cost and profit under optimal day-ahead prices

and optimal flat prices can be found in Table 4.4.

From Table 4.4, we can see that, for the same revenue (i.e. the total bills for all customers are the same), the cost of the retailer under optimal day-ahead pricing is 120.08 pounds, whereas the cost under optimal flat pricing is higher (139.35 pounds). This is due to the higher peak demand under flat pricing and thus the higher peak-time cost. Furthermore, the profit under optimal day-ahead pricing (134.92 pounds) is higher than the profit under optimal flat pricing (115.65 pounds). The example shows a very important potential of day-ahead pricing and our proposed approach: day-ahead pricing makes it possible to increase the retailer's profit without increasing customers' expenses.
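As a quick check of these figures, using only the values reported in Table 4.4 (the percentage is derived here and is not stated in the table):

\begin{align*}
\text{Profit}_{\text{day-ahead}} &= 255.00 - 120.08 = 134.92 \text{ pounds}, \\
\text{Profit}_{\text{flat}} &= 255.00 - 139.35 = 115.65 \text{ pounds}, \\
\text{Profit}_{\text{day-ahead}} - \text{Profit}_{\text{flat}} &= 19.27 \text{ pounds} \approx 16.7\% \text{ of the flat-pricing profit}.
\end{align*}

Since the revenue is identical in both cases, the whole profit difference comes from the lower cost of supply under day-ahead pricing.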


[Plot: price in pence/kWh (y-axis, 0 to 14) against hour ending (x-axis, 8AM to 5AM the next day); curves: Optimal Day-ahead Prices, Optimal Flat Prices]

Figure 4.5: Obtained optimal day-ahead prices and flat prices

Table 4.4: Revenue, cost and profit under different pricing strategies

Price setting                Revenue (pounds)   Cost (pounds)   Profit (pounds)
Optimal Day-ahead Pricing    255.00             120.08          134.92
Optimal Flat Pricing         255.00             139.35          115.65

4.6.3 Benefits to Customers

In this subsection, we show the effectiveness of the proposed energy management

scheme based on public day-ahead price data. We use the actual electricity prices

data adopted by ISO New England from January 1, 2012 to January 31, 2012,

which is available to the public on-line at [ISO12]. In the following, we will show

the result of the first customer.

The simulation result is shown in Figure 4.6, where we can see that, after adopting the energy management scheme, the daily bill payment is significantly reduced. Furthermore, by adopting our proposed financial incentive based waiting time scheme, the customers may obtain further benefits in terms of reduced payments, subject to an acceptable level of comfort.


[Plot omitted. x-axis: day (1/01/2012 to 31/01/2012); y-axis: daily electricity payment (pence), -40 to 120; series: with scheduling and waiting time, with scheduling but no waiting time, without scheduling.]

Figure 4.6: Daily electricity payment of one customer over one month

4.7 Chapter Summary

In this chapter, we model the interactions between the retailer and its customers as a bilevel optimization problem. Firstly, according to the load types, we categorize home appliances into interruptible, non-interruptible and curtailable appliances. For each category of appliances, a different appliance-level optimization model is given; together these models form the lower-level problem. Secondly, as common solutions to bilevel optimization problems, such as the KKT-condition-based approach, are not usable in our application setting (i.e. large-scale problems), a hybrid optimization approach based on multi-population genetic algorithms and individual optimization solutions has been proposed. The numerical results show that both the retailer and its customers can benefit from the proposed model.


Chapter 5

Smart Pricing to Demand

Response Management III – A

Learning based Approach

5.1 Introduction

In Chapter 3 and Chapter 4, we assumed that home energy management systems (HEMS) are embedded in the smart meters and that the interactions between the retailer and its customers can be modelled via a Stackelberg game or a bilevel optimization model. However, many customers do not have such a HEMS at the moment. Further, even if a customer has a HEMS installed, it is still very difficult or impossible for a retailer to identify the right utility function to model the customer’s usage behaviour, due to various uncertainties in usage patterns. As a result, to determine retail prices, the retailer has to learn the customers’ energy consumption patterns.

This chapter is adapted from our two papers [MZM13] [MZssb]. The rest of

this chapter is organized as follows. Firstly, the preliminary knowledge of relevant

machine learning concepts, which is to be used in the customer behaviour learning

models, is given in Section 5.2. Secondly, the problem statement to identify

the demand response management problem for the retailer is given in Section

5.3. Thirdly, the customer behaviour learning models are presented in Section

5.4. More specifically, a probabilistic behaviour learning model is proposed for

shiftable appliances and a linear price-demand model is proposed for curtailable

appliances. Fourthly, the smart pricing based demand response management


problem formulation and GA based distributed solution algorithms are given in

Section 5.5. Fifthly, numerical results that show the benefits of the proposed

smart pricing approach to the retailer and its customers are given in Section 5.6.

This chapter is concluded in Section 5.7.

5.2 Preliminary Knowledge – Machine Learning

Machine learning evolved from the study of pattern recognition and computational learning theory in artificial intelligence. It explores the construction and study of algorithms that can learn from data automatically to perform well in a given task. Online machine learning is used when the data become available sequentially, in order to determine a mapping from the data to the corresponding labels. In online learning, the mapping is updated after the arrival of every new data point, whereas in batch (or “offline”) learning, all the observations are used simultaneously to estimate the model.

In the following, we will look at relevant machine learning concepts such

as Bayes’ Theorem, Bayesian updating [RN09], linear regression analysis and

recursive identification methods, which are to be used in the customer behaviour

learning models in Section 5.4.

5.2.1 Conditional Probability and Bayes’ Theorem

In probability theory, a conditional probability measures the probability of an

event given that another event has occurred.

Let $A$ and $B$ be two events. If $N$ is the total number of possible outcomes and $N_A$ and $N_B$ are the numbers of outcomes corresponding to $A$ and $B$, then the probabilities of $A$ and $B$, denoted $P(A)$ and $P(B)$, are $P(A) = N_A/N$ and $P(B) = N_B/N$.

Events are said to be mutually exclusive if they have no outcomes in common. For mutually exclusive events the following holds:

$$P(A \cap B) = P(\emptyset) = 0,$$

$$P(A \cup B) = P(A) + P(B).$$


Further, the probability of $A$ occurring given that $B$ has occurred is the conditional probability of $A$, denoted $P(A|B)$, and the probability of $B$ occurring given that $A$ has occurred is the conditional probability of $B$, denoted $P(B|A)$. If $N_{AB}$ is the number of outcomes corresponding to both $A$ and $B$, then

$$P(A|B) = \frac{N_{AB}}{N_B} = \frac{N_{AB}}{N} \cdot \frac{N}{N_B} = \frac{P(A \cap B)}{P(B)}. \tag{5.1}$$

Similarly,

$$P(B|A) = \frac{N_{AB}}{N_A} = \frac{P(A \cap B)}{P(A)}. \tag{5.2}$$

In the following, we extend the discussion of conditional probability to Bayes’

theorem which is useful for updating a probability value based on additional

information that is later obtained. To understand this, we firstly introduce two

concepts:

• A prior probability is an initial probability value originally obtained

before any additional information is obtained.

• A posterior probability is a probability value that has been updated by

using additional information that is later obtained.

From Eqs. (5.1) and (5.2), the following Bayes formula can be derived:

$$P(A|B) = \frac{P(B|A)P(A)}{P(B)}, \tag{5.3}$$

where $P(B) = P(B|A)P(A) + P(B|\bar{A})P(\bar{A})$. Eq. (5.3) is the well-known Bayes’ Theorem, in which $\bar{A}$ denotes the event “not-$A$”, $P(A)$ is called the prior probability and $P(A|B)$ is called the posterior probability.
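As a minimal numerical sketch of Eq. (5.3), the function below computes the posterior from a prior and two conditional probabilities, expanding $P(B)$ by the law of total probability. The events and all numbers are hypothetical and chosen only for illustration.

```python
def posterior(prior_a, p_b_given_a, p_b_given_not_a):
    """Return P(A|B) from Bayes' theorem, Eq. (5.3)."""
    # Law of total probability: P(B) = P(B|A)P(A) + P(B|not-A)P(not-A).
    p_b = p_b_given_a * prior_a + p_b_given_not_a * (1.0 - prior_a)
    if p_b == 0.0:
        raise ValueError("P(B) is zero, so P(A|B) is undefined")
    return p_b_given_a * prior_a / p_b


# Hypothetical events: A = "customer is price sensitive",
# B = "appliance was used at the cheapest period".
p_a_given_b = posterior(prior_a=0.5, p_b_given_a=0.8, p_b_given_not_a=0.2)
print(p_a_given_b)
```

With these assumed inputs, $P(B) = 0.8 \times 0.5 + 0.2 \times 0.5 = 0.5$, so the posterior is $0.4/0.5 = 0.8$.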

5.2.2 Bayesian Inference and Updating [BT11]

Bayesian inference is a method of statistical inference in which Bayes’ theorem

is used to update the probability for a hypothesis (denoted as H) as evidence or

data (denoted as D) is acquired.

Bayesian inference derives the posterior probability P (H|D) as a consequence

of two antecedents, a prior probability P (H) and a “likelihood function” P (D|H)

derived from a statistical model for the observed data. Bayesian inference com-

putes the posterior probability according to the following Bayes’ theorem.


$$P(H|D) = \frac{P(D|H)P(H)}{P(D)}.$$

Bayesian updating: The process of going from the prior probability P (H)

to the posterior P (H|D) is called Bayesian updating. Bayesian updating is partic-

ularly important in the dynamic analysis of a sequence of data. We can express

Bayes’ theorem in the context of Bayesian updating as a statement about the

proportionality of two functions of H, i.e. P (D|H) and P (H), which is given as

follows:

P (H|D) ∝ P (D|H)× P (H).
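The proportionality form lends itself to sequential use: after each observation, the posterior becomes the prior for the next one. The sketch below illustrates this with two hypothetical hypotheses about a customer and a hypothetical observation sequence; none of the names or numbers come from the thesis.

```python
def bayes_update(prior, likelihood):
    """One Bayesian update: posterior proportional to likelihood times prior."""
    unnormalised = {h: likelihood[h] * prior[h] for h in prior}
    evidence = sum(unnormalised.values())  # this normalising constant is P(D)
    return {h: value / evidence for h, value in unnormalised.items()}


# Assumed probability, under each hypothesis, that the appliance is used
# at the cheapest period on a given day.
p_cheapest = {"price_sensitive": 0.8, "price_insensitive": 0.3}

# Uniform prior over the two hypotheses.
belief = {"price_sensitive": 0.5, "price_insensitive": 0.5}

# Hypothetical daily observations: True = cheapest period was used.
observations = [True, True, False, True]
for used_cheapest in observations:
    likelihood = {
        h: (p if used_cheapest else 1.0 - p) for h, p in p_cheapest.items()
    }
    belief = bayes_update(belief, likelihood)

print(belief)
```

Because each update only rescales the current belief, processing the observations one at a time gives the same posterior as multiplying all four likelihoods at once.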

5.2.3 Linear Regression Analysis

Regression analysis is a statistical tool for investigating the relationships between variables, such as the effect of a price increase upon demand. In the following, we focus on the linear regression model and the corresponding parameter estimation method, i.e. the least squares method.

Given a data set $\{y(t), x_1(t), \ldots, x_n(t)\}_{t=1}^{T}$, a linear regression model assumes that the relationship between the dependent variable $y(t)$ and the independent variables $x_1(t), \ldots, x_n(t)$ is linear up to an “error variable” $e(t)$, an unobserved random variable that adds noise to the linear relationship between the dependent variable and the independent variables. Thus the linear regression model takes the form

$$y(t) = \beta_0 + \beta_1 x_1(t) + \cdots + \beta_n x_n(t) + e(t) = x^{\top}(t)\beta + e(t), \quad t = 1, 2, \ldots, T, \tag{5.4}$$

where $x(t) = [1, x_1(t), \ldots, x_n(t)]^{\top}$ includes a leading 1 for the intercept $\beta_0$. The above model can also be written in matrix notation as

$$y = X\beta + e, \tag{5.5}$$

where $y = [y(1), y(2), \ldots, y(T)]^{\top}$ and $e = [e(1), e(2), \ldots, e(T)]^{\top}$ are $T \times 1$ vectors,


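$$X = \begin{bmatrix} 1 & x_1(1) & \cdots & x_n(1) \\ 1 & x_1(2) & \cdots & x_n(2) \\ \vdots & \vdots & \ddots & \vdots \\ 1 & x_1(T) & \cdots & x_n(T) \end{bmatrix}$$

is a $T \times (n+1)$ matrix and $\beta = [\beta_0, \beta_1, \ldots, \beta_n]^{\top}$ is an $(n+1) \times 1$ vector.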

Suppose $b$ is a candidate value for the parameter $\beta$. The sum of squared residuals (also known as the loss function) is then defined as

$$S(b) = \sum_{t=1}^{T} \left(y(t) - x^{\top}(t)b\right)^2 = (y - Xb)^{\top}(y - Xb). \tag{5.6}$$

The Ordinary Least Squares (OLS) method, which minimizes the sum of squared residuals, is adopted to solve the above quadratic minimization problem. By setting the derivative of $S(b)$ with respect to $b$ to zero, and assuming that $X^{\top}X$ is invertible, we obtain the optimal parameter estimate of the linear regression model Eq. (5.4):

$$\hat{\beta} = \arg\min_{b} S(b) = (X^{\top}X)^{-1}X^{\top}y. \tag{5.7}$$
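The closed-form estimate in Eq. (5.7) can be sketched in a few lines. The example below is a pure-Python illustration that solves the normal equations $(X^{\top}X)\beta = X^{\top}y$ by Gauss-Jordan elimination instead of forming the inverse explicitly. The helper names and the noise-free price, temperature and demand data are hypothetical.

```python
def solve_linear_system(a, b):
    """Solve a z = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        # Swap in the row with the largest pivot for numerical stability.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        pivot_value = aug[col][col]
        aug[col] = [v / pivot_value for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [aug[i][n] for i in range(n)]


def ols(x_rows, y):
    """Return the OLS estimate by solving (X^T X) beta = X^T y."""
    design = [[1.0] + list(row) for row in x_rows]  # leading 1 for the intercept
    k = len(design[0])
    xtx = [[sum(r[i] * r[j] for r in design) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(design, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)


# Hypothetical noise-free data: demand = 5 - 0.2 * price + 0.1 * temperature.
prices = [8.0, 10.0, 12.0, 14.0, 9.0]
temps = [20.0, 25.0, 22.0, 30.0, 18.0]
demand = [5.0 - 0.2 * p + 0.1 * t for p, t in zip(prices, temps)]

beta_hat = ols(list(zip(prices, temps)), demand)
print(beta_hat)
```

Since the data contain no noise, the estimate should recover the assumed coefficients up to rounding error.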

5.2.4 Recursive Identification

Given a current estimated model and a new observation, how should we update this model in order to take the new piece of information into account? The answer is the recursive identification method. In recursive identification methods, the parameter estimates are computed recursively over time: suppose we have an estimate $\hat{\theta}(t-1)$ at iteration $t-1$; recursive identification then computes a new estimate $\hat{\theta}(t)$ by a simple modification of $\hat{\theta}(t-1)$ when a new observation becomes available at iteration $t$.

The general recursive identification algorithm is given as follows:

$$\hat{\theta}(t) = \hat{\theta}(t-1) + K(t)\left(y(t) - \hat{y}(t)\right), \tag{5.8}$$

where $\hat{\theta}(t)$ is the parameter estimate at time $t$, $y(t)$ is the observed output at time $t$ and $\hat{y}(t)$ is the prediction of $y(t)$ based on observations up to time $t-1$. The gain $K(t)$ determines how much the current prediction error $y(t) - \hat{y}(t)$ affects the update of the parameter estimate.

Example 1. Recursive estimation of a constant [SS88]: Consider the

following system


$$y(t) = \theta_0 + e(t), \quad \forall t = 1, 2, \ldots \tag{5.9}$$

where $y(t)$ is the dependent variable and $e(t)$ is the corresponding error term. The ordinary least-squares estimate of $\theta_0$ is given as follows:

$$\hat{\theta}(t) = \frac{1}{t}\sum_{k=1}^{t} y(k). \tag{5.10}$$

We can rewrite Eq. (5.10) in the following form:

$$\hat{\theta}(t) = \frac{1}{t}\left(\sum_{k=1}^{t-1} y(k) + y(t)\right) = \frac{1}{t}\left((t-1)\hat{\theta}(t-1) + y(t)\right) = \hat{\theta}(t-1) + \frac{1}{t}\left(y(t) - \hat{\theta}(t-1)\right). \tag{5.11}$$

The above equation shows an appealing property: the new estimate $\hat{\theta}(t)$ equals the previous estimate $\hat{\theta}(t-1)$ plus a small correction term. The correction term is proportional to the deviation between the prediction $\hat{\theta}(t-1)$ and the observation $y(t)$. Moreover, the correction term is weighted by the factor $1/t$, which implies that the magnitude of the correction will decrease over time.
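The recursion in Eq. (5.11) can be checked numerically against the batch average of Eq. (5.10). The following sketch uses a short, hypothetical observation sequence; the function name is illustrative.

```python
def recursive_mean(observations):
    """Apply Eq. (5.11): theta(t) = theta(t-1) + (1/t) * (y(t) - theta(t-1))."""
    theta = 0.0  # theta(0); its value does not matter because the gain at t = 1 is 1
    history = []
    for t, y in enumerate(observations, start=1):
        theta = theta + (y - theta) / t
        history.append(theta)
    return history


# Hypothetical observations of a constant corrupted by noise.
y = [4.0, 6.0, 5.0, 7.0, 3.0]
estimates = recursive_mean(y)
batch_mean = sum(y) / len(y)  # Eq. (5.10) evaluated at t = 5

print(estimates[-1], batch_mean)
```

The last recursive estimate coincides with the batch mean, while only the previous estimate and the step counter need to be stored.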

Example 2. Recursive Least Squares Estimation [SS88]: Consider the following linear model

$$y(t) = x^{\top}(t)\theta_0 + e(t), \tag{5.12}$$

where $y(t)$ is the dependent variable, $x(t)$ is the vector of independent variables and $e(t)$ is the corresponding error term.

The least-squares estimate of $\theta_0$ is given by:

$$\hat{\theta}(t) = \left[\sum_{k=1}^{t} x(k)x^{\top}(k)\right]^{-1}\left[\sum_{k=1}^{t} x(k)y(k)\right]. \tag{5.13}$$

Denoting $P(t) = \left[\sum_{k=1}^{t} x(k)x^{\top}(k)\right]^{-1}$, we have

$$P^{-1}(t) = P^{-1}(t-1) + x(t)x^{\top}(t). \tag{5.14}$$

As a result, Eq. (5.13) can be written in the recursive form as follows:


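$$\begin{aligned}
\hat{\theta}(t) &= P(t)\left[\sum_{k=1}^{t-1} x(k)y(k) + x(t)y(t)\right] \\
&= P(t)\left[P^{-1}(t-1)\hat{\theta}(t-1) + x(t)y(t)\right] \\
&= \hat{\theta}(t-1) + P(t)x(t)\left[y(t) - x^{\top}(t)\hat{\theta}(t-1)\right].
\end{aligned} \tag{5.15}$$

After denoting $K(t) = P(t)x(t)$ and $\varepsilon(t) = y(t) - x^{\top}(t)\hat{\theta}(t-1)$, we have

$$\hat{\theta}(t) = \hat{\theta}(t-1) + K(t)\varepsilon(t). \tag{5.16}$$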

In the above, the term $\varepsilon(t)$ is interpreted as the prediction error: it is the difference between the observed sample $y(t)$ and the predicted value $x^{\top}(t)\hat{\theta}(t-1)$. If $\varepsilon(t)$ is ‘small’, the estimate $\hat{\theta}(t-1)$ is good and should not be modified much. The matrix $K(t)$ is interpreted as the weighting or ‘gain’ matrix.

To implement the algorithm, Eq. (5.14) is used to compute $P(t)$, which is then used to compute $K(t)$. Note that Eq. (5.14) requires a matrix inversion at each time step, which is computationally expensive. In order to simplify the computation of $P(t)$, the Matrix Inversion Lemma is first introduced.

Lemma 4. ([Lju99]) Let $A$, $B$, $C$ and $D$ be matrices of compatible sizes: $A$ is $n$-by-$n$, $B$ is $n$-by-$k$, $C$ is $k$-by-$k$ and $D$ is $k$-by-$n$, with $A$, $C$ and $C^{-1} + DA^{-1}B$ invertible. Then the following holds:

$$[A + BCD]^{-1} = A^{-1} - A^{-1}B\left[C^{-1} + DA^{-1}B\right]^{-1}DA^{-1}. \tag{5.17}$$

Applying Lemma 4 to Eq. (5.14) with $A = P^{-1}(t-1)$, $B = x(t)$, $C = 1$ and $D = x^{\top}(t)$, we obtain the following update:

$$P(t) = P(t-1) - \frac{P(t-1)x(t)x^{\top}(t)P(t-1)}{1 + x^{\top}(t)P(t-1)x(t)}. \tag{5.18}$$

Substituting the above equation into $K(t)$, we can further simplify the computation as follows:

$$K(t) = P(t)x(t) = \frac{P(t-1)x(t)}{1 + x^{\top}(t)P(t-1)x(t)}. \tag{5.19}$$

Finally, the complete recursive learning algorithm consists of Eqs. (5.16), (5.19) and (5.18).
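A minimal pure-Python sketch of one pass of this recursive algorithm is given below. The text does not specify how $\hat{\theta}(0)$ and $P(0)$ are initialised, so the zero initial estimate and the large diagonal $P(0)$ used here are assumptions, as are the function name and the noise-free data.

```python
def rls_update(theta, p, x, y):
    """One recursive least-squares step following Eqs. (5.16), (5.18), (5.19)."""
    n = len(x)
    # P(t-1) x(t); P is symmetric, so x^T(t) P(t-1) is the same vector transposed.
    px = [sum(p[i][j] * x[j] for j in range(n)) for i in range(n)]
    denom = 1.0 + sum(x[i] * px[i] for i in range(n))
    gain = [v / denom for v in px]                                # K(t), Eq. (5.19)
    error = y - sum(x[i] * theta[i] for i in range(n))            # prediction error
    theta_new = [theta[i] + gain[i] * error for i in range(n)]    # Eq. (5.16)
    p_new = [[p[i][j] - gain[i] * px[j] for j in range(n)]
             for i in range(n)]                                   # Eq. (5.18)
    return theta_new, p_new


# Hypothetical noise-free data from y = 2 + 0.5 * x1, with x(t) = [1, x1(t)].
samples = [([1.0, float(k)], 2.0 + 0.5 * k) for k in range(1, 7)]

theta = [0.0, 0.0]                    # assumed initial estimate
p = [[1e6, 0.0], [0.0, 1e6]]          # assumed large initial P(0)
for x, y in samples:
    theta, p = rls_update(theta, p, x, y)

print(theta)
```

With a large $P(0)$ the initial estimate carries almost no weight, so the final estimate should be close to the coefficients used to generate the data. No matrix inversion is needed at any step.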


5.2.5 Recursive Least Square with Forgetting Factor [Lju99]

In the following, we present a modification of the previous recursive least squares algorithm that is useful for identifying time-varying parameters. The forgetting factor approach starts from a slightly modified loss function:

$$V_t(\theta_0) = \sum_{k=1}^{t} \lambda^{t-k}\left(y(k) - x^{\top}(k)\theta_0\right)^2. \tag{5.20}$$

The loss function used earlier corresponds to $\lambda = 1$, whereas it now contains a forgetting factor $\lambda$, a number less than 1 (e.g. 0.99 or 0.95). As $t$ increases, past observations are discounted. The smaller the value of $\lambda$, the more quickly information obtained from previous data is forgotten. The recursive least squares algorithm with forgetting factor can be derived from the modified loss function Eq. (5.20); the derivation is omitted here.

Finally, the modified algorithm is given as follows:

$$\begin{aligned}
\hat{\theta}(t) &= \hat{\theta}(t-1) + K(t)\varepsilon(t), \\
\varepsilon(t) &= y(t) - x^{\top}(t)\hat{\theta}(t-1), \\
K(t) &= P(t)x(t) = \frac{P(t-1)x(t)}{\lambda + x^{\top}(t)P(t-1)x(t)}, \\
P(t) &= \frac{1}{\lambda}\left(P(t-1) - \frac{P(t-1)x(t)x^{\top}(t)P(t-1)}{\lambda + x^{\top}(t)P(t-1)x(t)}\right).
\end{aligned}$$
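To illustrate the effect of the forgetting factor, the sketch below applies the modified algorithm to the scalar case $x(t) = 1$ (estimation of a constant) when the underlying parameter changes half-way through the data. The function name, the initial values and the data are assumptions made for illustration only.

```python
def rls_scalar(observations, lam, p0=1e6, theta0=0.0):
    """Scalar RLS with forgetting factor lam for y(t) = theta * x(t) + e(t), x(t) = 1."""
    theta, p = theta0, p0
    x = 1.0
    for y in observations:
        error = y - x * theta                     # prediction error
        denom = lam + x * p * x
        gain = p * x / denom                      # K(t)
        theta = theta + gain * error
        p = (p - p * x * x * p / denom) / lam     # P(t)
    return theta


# Hypothetical noise-free parameter that jumps from 1.0 to 3.0 after 50 samples.
y = [1.0] * 50 + [3.0] * 50

theta_plain = rls_scalar(y, lam=1.0)    # no forgetting: all samples weighted equally
theta_forget = rls_scalar(y, lam=0.9)   # forgetting: recent samples dominate

print(theta_plain, theta_forget)
```

Without forgetting, the estimate ends near the overall average of the data, whereas with $\lambda = 0.9$ it ends close to the most recent parameter value, which is the behaviour wanted for time-varying parameters.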

5.3 Problem Statement

In this chapter, we consider a residential power network which consists of one

retailer and N customers.

It is assumed that each customer is equipped with a smart meter. The retailer procures electricity from the wholesale market and announces the retail price to the customers (smart meters) via the two-way communication infrastructure. The customers then respond to the price and temperature signals by shifting or curtailing their energy usage according to their preferences. It is assumed that the customers do not have a HEMS installed. As a result, the customers’ energy


consumption behaviours are not known by the retailer. Instead, the retailer has

to learn the customers’ energy consumption patterns in order to implement the

price determination and customers’ behaviour analysis.

To learn the customers’ appliance-level usage patterns, the retailer needs to know the historical energy consumption data of the appliances in response to hourly price and temperature information. We employ a technique called non-intrusive appliance load monitoring (NILM), which uses signal analysis algorithms to extract the hourly consumption and on/off time information of each appliance from the total electrical power consumption of each house [Har92]. With the development of hardware such as smart meters, high-resolution smart meter data have become available to the public [KJ11], which in turn promotes the development of NILM techniques [ZR11]. As existing NILM algorithms [CCTL12] [CLCL13] can achieve a very high recognition accuracy, the appliance-level data they provide support a high accuracy of our proposed customer behaviour models and, in turn, of the pricing model.

Due to the fact that different home appliances have different load patterns,

we first categorize the appliances into shiftable appliances and curtailable appli-

ances. Secondly, we propose two appliance-level learning models to model how

the demand changes in response to price and temperature changes, which will be

described in Section 5.4. For shiftable appliances such as dish-washers and wash-

ing machines, the operations can be shifted from high price period to low price

period but the total energy consumption to finish the operations is fixed [MZ14].

By observing the above characteristic of shiftable appliances, a probabilistic be-

haviour model and its learning algorithm are proposed to model an individual

customer’s shifting probabilities when using shiftable appliances dependent on

different hourly prices. For curtailable appliances such as air-conditioning, the

operations cannot be shifted but the total energy consumption can be adjusted

according to customers’ preferences. By observing the above load patterns of cur-

tailable appliances, a multiple linear regression model is proposed to predict an

individual customer’s hourly energy consumption in response to different hourly

prices and temperatures.


5.4 Customer Behaviour Learning Models

As aforementioned, the customers’ houses are assumed to have smart meters in-

stalled with the capability of two-way communication. The non-intrusive appli-

ance load monitoring (NILM) techniques are also embedded in the smart meters.

As a result, the historical data such as the hourly price and the hourly energy

consumption information of each shiftable and curtailable appliance at each hour

are available. Furthermore, the historical data of hourly temperature information

are also assumed to be available as a public dataset. With all the historical data

available, we are able to learn customers’ behaviour patterns in using shiftable

appliances and curtailable appliances.

For each customer $n \in N$, we define the set of shiftable appliances $S_n$ and the set of curtailable appliances $C_n$. Furthermore, we define the whole appliance set as $A_n = S_n \cup C_n$. However, for notational simplicity, we omit the subscript $n$ in the rest of this section.

5.4.1 Shiftable Appliances

Denote the scheduling window for appliance $s$ as $H_s \triangleq \{a_s, \ldots, b_s\}$, where $a_s$ is the earliest possible time to switch on appliance $s$ and $b_s$ is the latest possible time to switch it off. Let $T_s = b_s - a_s + 1$ denote the length of the scheduling window. Assume the available historical data for appliance $s$ are the electricity consumption scheduling vectors $X_s(d) = [x^s_{a_s}(d), x^s_{a_s+1}(d), \ldots, x^s_{b_s}(d)]$ ($d = 1, 2, \ldots, D$), where $x^s_h(d)$ ($h = a_s, a_s+1, \ldots, b_s$) represents the electricity consumption during hour $h$ by appliance $s$ on day $d$; $x^s_h(d) > 0$ and $x^s_h(d) = 0$ mean that appliance $s$ is on or off at hour $h$, respectively. Furthermore, the total time taken for appliance $s$ to finish its operation is denoted as $L_s$, and we assume that each time slot is 1 hour. However, our proposed learning algorithms can be adapted to any sub-hour interval (e.g. 10 minutes).

For each shiftable appliance $s \in S$, we denote the total energy consumption needed to finish its operation as $E_s$, and we assume that each shiftable appliance runs at a constant power rate. As a result, the energy consumption at each running slot can be expressed as $E_s/L_s$. This assumption is based on the fact that ordinary customers cannot tell how the energy consumption of a specific shiftable appliance differs between one time slot and the next. As a result, the customers will shift their energy usage


assuming that the appliances run at a constant rate.

Given the above information about shiftable appliance $s$, the specific problem we aim to solve is how to model customers’ behaviour in using appliance $s$. As far as we are aware, no previous research has addressed this problem, i.e. modelling customers’ behaviour at the appliance level for shiftable appliances. Several difficulties need to be considered when solving this problem:

1. The conventional price-demand model approach is not applicable here, as the demand or consumption of shiftable appliances is fixed and thus we cannot use the price elasticity to model how the demand changes in response to price changes.

2. There are many different patterns of customer behaviour. Some customers are price optimizers who always find the cheapest period to use their appliances via bill minimization software; some customers are price sensitive and try to use their appliances cheaply as far as it is convenient; and some customers are price insensitive and simply switch on their appliances when they need them, without thinking about prices.

To overcome the above difficulties and model different customers’ usage patterns as they occur in reality, a probabilistic behaviour model is proposed. For appliance $s$ of a given customer, the historical data show the actual period in which appliance $s$ was used in response to the given historical prices on each day. The basic idea behind the model is to use these data to calculate the probabilities that appliance $s$ was used at the cheapest period, the second cheapest period, and so on up to the most expensive period.


Table 5.1: Historical data about dishwasher usage and prices

Day | Switch-on period $OT^s(d)$ | Prices (cent) between [7pm, 11pm] | The i-th cheapest time period $PT^s_i(d)$ | Appliance switched on at i-th cheapest period
1 | 7 and 8 | 10; 12; 12; 14 | 1st = 7 and 8; 2nd = 8 and 9; 3rd = 9 and 10 | 1st cheapest
2 | 8 and 9 | 10; 12; 12; 14 | Same as above | 2nd cheapest
3 | 9 and 10 | 12; 14; 12; 10 | 1st = 9 and 10; 2nd = 7 and 8, 8 and 9 | 1st cheapest
4 | 7 and 8 | 8; 10; 12; 14 | 1st = 7 and 8; 2nd = 8 and 9; 3rd = 9 and 10 | 1st cheapest
5 | 7 and 8 | 12; 10; 13; 14 | Same as above | 1st cheapest
6 | 7 and 8 | 12; 10; 12; 14 | 1st = 7 and 8, 8 and 9; 2nd = 9 and 10 | 1st cheapest
7 | 7 and 8 | 8; 10; 12; 16 | 1st = 7 and 8; 2nd = 8 and 9; 3rd = 9 and 10 | 1st cheapest
8 | 9 and 10 | 12; 10; 12; 9 | 1st = 9 and 10; 2nd = 7 and 8, 8 and 9 | 1st cheapest
9 | 7 and 8 | 12; 10; 12; 9 | Same as above | 2nd cheapest
10 | 9 and 10 | 10; 12; 12; 14 | 1st = 7 and 8; 2nd = 8 and 9; 3rd = 9 and 10 | 3rd cheapest


Table 5.2: Historical data about PHEV usage and prices

Day | Switch-on period $OT^s(d)$ | Prices (cent) between [7pm, 11pm] | The i-th cheapest time period $PT^s_i(d)$ | Appliance switched on at i-th cheapest period
1 | 7 and 8 | 10; 12; 12; 14 | 1st = 7 and 8, 7 and 9; 2nd = 8 and 9, 7 and 10; 3rd = 8 and 10, 9 and 10 | 1st cheapest
2 | 8 and 9 | 10; 12; 12; 14 | Same as above | 2nd cheapest
3 | 9 and 10 | 10; 12; 12; 14 | Same as above | 3rd cheapest
4 | 7 and 9 | 10; 12; 12; 14 | Same as above | 1st cheapest
5 | 8 and 10 | 10; 12; 12; 14 | Same as above | 3rd cheapest
6 | 8 and 10 | 12; 10; 12; 9 | 1st = 8 and 10; 2nd = 7 and 10, 9 and 10; 3rd = 7 and 8, 8 and 9; 4th = 7 and 9 | 1st cheapest
7 | 7 and 8 | 12; 10; 12; 9 | Same as above | 3rd cheapest
8 | 8 and 9 | 12; 10; 12; 9 | Same as above | 3rd cheapest
9 | 9 and 10 | 12; 10; 12; 9 | Same as above | 2nd cheapest
10 | 7 and 9 | 12; 10; 12; 9 | Same as above | 4th cheapest


In order to represent the above idea mathematically, some notation is introduced. Let $PT^s_i(d)$ denote the $i$-th cheapest time period for appliance $s$ on day $d$, $OT^s(d)$ denote the operation (i.e. switch-on) period for appliance $s$ on day $d$, and $P^s_i$ denote the probability that a customer uses appliance $s$ at the $i$-th cheapest period. Two examples are now given to illustrate this notation. We first clarify how time periods are written in the two examples: “7 and 8” in Table 5.1 stands for “7pm-8pm and 8pm-9pm”. Due to the space limitation of the tables, we use the shorter representation, i.e. “7 and 8”. All the other time periods in Tables 5.1 and 5.2 follow the same convention.

Example 1. Let appliance $s$ be a dish washer which requires non-interruptible use for $L_s = 2$ hours to complete the washing task, and let its scheduling window be $H_s = \{\text{7pm}, \text{8pm}, \ldots, \text{11pm}\}$. Assume that the historical data about its usage are given in Table 5.1.

Based on Table 5.1, it can be seen that the dish washer was used 7 out of 10 times at the cheapest price period, i.e. $P^s_1 = 7/10 = 0.7$. Similarly, it was used 2 out of 10 times at the 2nd cheapest price period, i.e. $P^s_2 = 2/10 = 0.2$, and 1 out of 10 times at the 3rd cheapest price period, i.e. $P^s_3 = 1/10 = 0.1$. From these probabilities, it can be seen that the customer is relatively price sensitive, as the dish washer was used at the cheapest price period most of the time.
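The counting in Example 1 can be reproduced programmatically. The sketch below takes the switch-on hours and prices from Table 5.1, ranks every contiguous 2-hour run by its total cost (runs with equal cost share a rank) and counts how often each rank was chosen. It assumes that the four prices apply to the hours starting at 7pm, 8pm, 9pm and 10pm; the variable and function names are illustrative.

```python
# Rows of Table 5.1: (first hour of the observed 2-hour run, prices for the
# hours starting at 7pm, 8pm, 9pm and 10pm).
table_5_1 = [
    (7, [10, 12, 12, 14]),
    (8, [10, 12, 12, 14]),
    (9, [12, 14, 12, 10]),
    (7, [8, 10, 12, 14]),
    (7, [12, 10, 13, 14]),
    (7, [12, 10, 12, 14]),
    (7, [8, 10, 12, 16]),
    (9, [12, 10, 12, 9]),
    (7, [12, 10, 12, 9]),
    (9, [10, 12, 12, 14]),
]
FIRST_HOUR = 7   # first hour of the price vector (7pm)
RUN_LENGTH = 2   # L_s = 2 hours, non-interruptible


def cheapness_rank(start_hour, prices):
    """Rank (1 = cheapest) of the contiguous run starting at start_hour."""
    costs = {
        FIRST_HOUR + k: sum(prices[k:k + RUN_LENGTH])
        for k in range(len(prices) - RUN_LENGTH + 1)
    }
    distinct_costs = sorted(set(costs.values()))  # equal costs share one rank
    return distinct_costs.index(costs[start_hour]) + 1


ranks = [cheapness_rank(start, prices) for start, prices in table_5_1]
probabilities = {i: ranks.count(i) / len(ranks) for i in sorted(set(ranks))}
print(ranks)
print(probabilities)
```

The resulting frequencies match the values 0.7, 0.2 and 0.1 stated in Example 1.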

Example 2. Let appliance $s$ be a PHEV which requires interruptible charging for $L_s = 2$ hours per day to meet the daily driving need, and let its scheduling window be $H_s = \{\text{7pm}, \text{8pm}, \ldots, \text{11pm}\}$. Assume that the historical data about its usage are given in Table 5.2.

Based on Table 5.2, it can be seen that the PHEV was charged 3 out of 10 times at the cheapest price period, 2 out of 10 times at the 2nd cheapest period, 4 out of 10 times at the 3rd cheapest period and 1 out of 10 times at the 4th cheapest period. That is, $P^s_1 = 3/10 = 0.3$, $P^s_2 = 2/10 = 0.2$, $P^s_3 = 4/10 = 0.4$ and $P^s_4 = 1/10 = 0.1$. From these probabilities, it can be seen that the customer is relatively price insensitive, as the PHEV was often charged at the less cheap price periods.
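Because PHEV charging is interruptible, the candidate periods in Example 2 are all pairs of hours in the window rather than only contiguous runs. The sketch below re-derives the counts from Table 5.2 under that reading, again assuming that the four prices apply to the hours starting at 7pm, 8pm, 9pm and 10pm; the names are illustrative.

```python
from itertools import combinations

HOURS = [7, 8, 9, 10]   # hours starting at 7pm, 8pm, 9pm and 10pm
CHARGE_HOURS = 2        # L_s = 2 hours, interruptible

# Rows of Table 5.2: (the two hours in which the PHEV was charged, prices).
table_5_2 = [
    ((7, 8), [10, 12, 12, 14]),
    ((8, 9), [10, 12, 12, 14]),
    ((9, 10), [10, 12, 12, 14]),
    ((7, 9), [10, 12, 12, 14]),
    ((8, 10), [10, 12, 12, 14]),
    ((8, 10), [12, 10, 12, 9]),
    ((7, 8), [12, 10, 12, 9]),
    ((8, 9), [12, 10, 12, 9]),
    ((9, 10), [12, 10, 12, 9]),
    ((7, 9), [12, 10, 12, 9]),
]


def cheapness_rank(chosen, prices):
    """Rank (1 = cheapest) of the chosen pair of hours among all pairs."""
    price_at = dict(zip(HOURS, prices))
    costs = {
        combo: sum(price_at[h] for h in combo)
        for combo in combinations(HOURS, CHARGE_HOURS)
    }
    distinct_costs = sorted(set(costs.values()))  # equal costs share one rank
    return distinct_costs.index(costs[chosen]) + 1


ranks = [cheapness_rank(chosen, prices) for chosen, prices in table_5_2]
probabilities = {i: ranks.count(i) / len(ranks) for i in sorted(set(ranks))}
print(ranks)
print(probabilities)
```

The resulting frequencies match the values 0.3, 0.2, 0.4 and 0.1 stated in Example 2.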

Although the above ideas and intuitive examples give insight into how the learning algorithm works, some further issues, such as how to deal with uncertainties in the price signals, still need to be solved. Motivated by this, a formal learning model with theoretically sound analysis is presented as follows.


Before giving more technical details, we make some assumptions and clarifications for this learning model.

1. This learning model is constructed in the context of smart pricing, i.e. the hourly prices should differ to some extent across days. The model does not apply to the situation where the electricity price at each hour is the same every day (such as the flat prices currently used in the UK).

2. Our probabilistic learning model is a function of prices. We treat day $d$ in the learning model as a time index.

3. As the learning model is recursively updated, noise and outliers in the data can be captured.

If the historical data cover days $d = 1, 2, \ldots, D$ and the hourly prices are all different within each day, then the probability that shiftable appliance $s$ is on at the $i$-th cheapest price period, based on the historical data up to day $d$, is

$$P^s_i(d) = \frac{f_i(d)}{d}, \quad d = 1, 2, \ldots \tag{5.21}$$

where $f_i(d)$ represents the number of days on which appliance $s$ is on at the $i$-th cheapest price period within the past $d$ days. For notational simplicity, the superscript $s$ on the right-hand side of (5.21) is omitted. Let the current day be $d$, so that $f_i(d)$ and $P^s_i(d)$ are known. Let $\delta_i(d+1)$ represent the probability that appliance $s$ is on at the $i$-th cheapest period based on the data of day $d+1$ ($\delta_i(d+1)$ being 1 if $s$ is on at the $i$-th cheapest period and 0 otherwise); it then becomes a new piece of information to be used to obtain $P^s_i(d+1)$. It is called a new piece of information because it is the observation received on day $d+1$ and is not available on day $d$ or before. As a result, the above formula can be rewritten recursively as follows:

$$\begin{aligned}
P^s_i(d+1) &= \frac{f_i(d+1)}{d+1} = \frac{f_i(d+1)}{d} \cdot \frac{d}{d+1} \\
&= \frac{f_i(d) + \delta_i(d+1)}{d} \cdot \frac{(d+1) - 1}{d+1} \\
&= \frac{f_i(d) + \delta_i(d+1)}{d} - \frac{f_i(d) + \delta_i(d+1)}{d} \cdot \frac{1}{d+1} \\
&= P^s_i(d) + \frac{\delta_i(d+1)}{d} - \frac{1}{d+1}P^s_i(d) - \frac{\delta_i(d+1)}{d} \cdot \frac{1}{d+1} \\
&= P^s_i(d) - \frac{1}{d+1}P^s_i(d) + \frac{1}{d+1}\delta_i(d+1) \\
&= P^s_i(d) + \frac{1}{d+1}\left[\delta_i(d+1) - P^s_i(d)\right].
\end{aligned} \tag{5.22}$$


The above recursive formula shows that the updating of the probability $P^s_i(d)$ takes a form often used in machine learning algorithms: when a new piece of information $\delta_i(d+1)$ is received, the updated probability $P^s_i(d+1)$ is equal to the existing probability $P^s_i(d)$ plus an adjustment term. The adjustment term consists of the adjustment coefficient $1/(d+1)$ and the prediction error term $[\delta_i(d+1) - P^s_i(d)]$. It is called the prediction error term because $\delta_i(d+1) - P^s_i(d)$ is the difference between the actual probability and the predicted probability, where $\delta_i(d+1)$ is the actual probability that the appliance is on at the $i$-th cheapest period based on the data of day $d+1$ and $P^s_i(d)$ is the estimated probability that the appliance is on at the $i$-th cheapest period based on the historical data up to day $d$. If $\delta_i(d+1) = 1$, that is, $s$ is on at the $i$-th cheapest period on day $d+1$, then $P^s_i(d+1)$ is adjusted up a little, as the new information $\delta_i(d+1) = 1$ provides a positive adjustment (since $[\delta_i(d+1) - P^s_i(d)] > 0$) and thus reinforces the probability. If $\delta_i(d+1) = 0$, then $P^s_i(d+1)$ is adjusted down a little, as the new information $\delta_i(d+1) = 0$ provides a negative adjustment (since $[\delta_i(d+1) - P^s_i(d)] < 0$) and thus weakens the probability.
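The recursive update of Eq. (5.22) can be checked against the direct frequency count of Eq. (5.21). The sketch below replays the ten daily observations of Example 1 (Table 5.1), expressed as the rank of the period in which the dish washer was used each day; the function and variable names are illustrative.

```python
def update_probabilities(prob, delta, d):
    """Eq. (5.22): P_i(d+1) = P_i(d) + (delta_i(d+1) - P_i(d)) / (d + 1)."""
    return [p + (dl - p) / (d + 1) for p, dl in zip(prob, delta)]


# Rank of the period used on each of the 10 days in Table 5.1.
observed_ranks = [1, 2, 1, 1, 1, 1, 1, 1, 2, 3]
num_ranks = 3

# P_i(0) is arbitrary: on the first day the coefficient 1/(d+1) equals 1,
# so the initial value is overwritten by the first observation.
prob = [0.0] * num_ranks
for d, rank in enumerate(observed_ranks):   # d = number of days already seen
    delta = [1.0 if i + 1 == rank else 0.0 for i in range(num_ranks)]
    prob = update_probabilities(prob, delta, d)

# Direct evaluation of Eq. (5.21) for comparison.
direct = [observed_ranks.count(i + 1) / len(observed_ranks) for i in range(num_ranks)]
print(prob, direct)
```

Both computations give the same probabilities, but the recursive form needs only the current estimate and the day counter rather than the full history.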

So far, the recursive formula for the probability $P^s_i(d)$ has been derived under the assumption that the hourly prices are all different within each day and that there is a strict order between the costs (i.e. the sums of the hourly prices) of the possible periods at which $s$ can be on within each day, so that $\delta_i(d)$ only takes the value 1 or 0. However, under some circumstances, many hourly prices are the same within a day, possibly for many days, because peak-hour prices are often set at the maximum acceptable price bound. The reason is that higher prices would create a poor price image and lead to dissatisfaction or complaints from customers and political opposition from the government or regulatory body, while lower prices would be unable to cover the extra costs of generating the peak-hour power and would lead to a loss of profit. As a result, two or more periods at which $s$ can be on may have the same cost, i.e. there may be more than one $i$-th cheapest price period. We now analyse how to extend the recursive formula to handle this more complicated case.

To start the analysis, consider an example first. Suppose that appliance $s$ has 4 possible operation periods, which are denoted as $OT_i$ ($i = 1, 2, 3, 4$). Further assume that the costs for $OT_1$, $OT_2$, $OT_3$ and $OT_4$ are the 1st, 2nd, 3rd and 4th cheapest respectively for the first $d = 10$ days and the corresponding probabilities that $s$


is on at the i-th cheapest period are given as below:

$$P_1^s(10) = \frac{5}{10}, \quad P_2^s(10) = \frac{4}{10}, \quad P_3^s(10) = \frac{1}{10}, \quad P_4^s(10) = 0. \tag{5.23}$$

Now assume that from day 11 onwards, the costs of the first 3 periods (i.e. $OT_1$, $OT_2$, $OT_3$) are the same and lower than that of $OT_4$, and that s is always on at $OT_2$. In this case, we cannot simply set $\delta_i(d+1) = \delta_i(11)$ ($i = 1, 2, 3$), the probability that appliance s is on at the i-th cheapest period on day $d + 1 = 11$, to 1 or 0, because it is impossible to tell whether s is on at the 1st, 2nd or 3rd cheapest period. A simple and intuitive solution is to spread the probability evenly, i.e. $\delta_i(11) = 1/3$ ($i = 1, 2, 3$). However, this is not the most accurate estimate, since the probability estimates in Eq. (5.23) tell us that the cheapest, 2nd cheapest and 3rd cheapest periods are the 1st, 2nd and 3rd most likely choices respectively. Therefore, a more theoretically sound estimation method for $\delta_i(11)$ ($i = 1, 2, 3$) is needed. Such a method is illustrated as follows: the event that s is on at $OT_2$ implies that appliance s has been chosen to be on at the cheapest, the 2nd cheapest or the 3rd cheapest period, but we cannot determine exactly which of these has happened. Denote the event

$$A_i = \{\, s \text{ is on at the } i\text{-th cheapest period} \,\}, \quad i = 1, 2, 3. \tag{5.24}$$

Then the problem of determining $\delta_i(d+1) = \delta_i(11)$ becomes: given that $A_1$, $A_2$ or $A_3$ occurs, what is the probability that s is on at the i-th cheapest period on day $d+1$? That is, we need to calculate the following conditional probability:

$$\delta_i(d+1) = P(A_i \mid A_1 \cup A_2 \cup A_3)$$

where $\cup$ represents the set or event union. Based on the definition of conditional probability, and the fact that events $A_1$, $A_2$ and $A_3$ are mutually exclusive, we have

$$\delta_i(d+1) = P(A_i \mid A_1 \cup A_2 \cup A_3) = \frac{P[A_i \cap (A_1 \cup A_2 \cup A_3)]}{P(A_1 \cup A_2 \cup A_3)} = \frac{P(A_i)}{P(A_1) + P(A_2) + P(A_3)} = \frac{P_i^s(d)}{P_1^s(d) + P_2^s(d) + P_3^s(d)}. \tag{5.25}$$

In the above, the last equality is obtained via replacing P (Ai) (which is the


probability that s is on at the i-th cheapest period) by the probability estimated from the historical data up to day d. In fact, the above method is essentially the well-known Bayesian updating method [RN09]. Based on the above formula and (5.23), we obtain

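$$\begin{aligned}
\delta_1(d+1) &= \frac{P_1^s(d)}{P_1^s(d) + P_2^s(d) + P_3^s(d)} = \frac{0.5}{0.5 + 0.4 + 0.1} = 0.5, \\
\delta_2(d+1) &= \frac{P_2^s(d)}{P_1^s(d) + P_2^s(d) + P_3^s(d)} = \frac{0.4}{0.5 + 0.4 + 0.1} = 0.4, \\
\delta_3(d+1) &= \frac{P_3^s(d)}{P_1^s(d) + P_2^s(d) + P_3^s(d)} = \frac{0.1}{0.5 + 0.4 + 0.1} = 0.1.
\end{aligned}$$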

Substituting these into formula (5.22), we have

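$$\begin{aligned}
P_1^s(d+1) &= P_1^s(d) + \frac{1}{d+1}\left[\delta_1(d+1) - P_1^s(d)\right] = 0.5, \\
P_2^s(d+1) &= P_2^s(d) + \frac{1}{d+1}\left[\delta_2(d+1) - P_2^s(d)\right] = 0.4, \\
P_3^s(d+1) &= P_3^s(d) + \frac{1}{d+1}\left[\delta_3(d+1) - P_3^s(d)\right] = 0.1.
\end{aligned}$$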

Similarly, as long as the costs of the first 3 periods (i.e. $OT_1$, $OT_2$, $OT_3$) remain the same from day 11 onwards, it follows that $P_1^s(d+k) = 0.5$, $P_2^s(d+k) = 0.4$, $P_3^s(d+k) = 0.1$ ($k = 1, 2, \ldots$). The intuitive explanation is that these flat-price events (appliance s is on at one of three equally priced periods) do not provide any new information, so our knowledge of the probabilities that s is on at the i-th cheapest period stays the same as it was up to day 10.

Now we consider the same example under a different scenario, in which, from day 11 onwards, the costs of the first 2 periods (i.e. $OT_1$, $OT_2$) are the same and lower than that of $OT_3$, the cost of $OT_3$ is lower than that of $OT_4$, and s is always on at $OT_1$ or $OT_2$. In this scenario, we intuitively expect that $P_3^s(10) = \frac{1}{10}$ will be reduced, as s is never on at $OT_3$, and that $P_1^s(10) = \frac{5}{10}$ and $P_2^s(10) = \frac{4}{10}$ will both increase, as s is always on at these price periods. The mathematical analysis below follows the same method as in the previous scenario.

Using the notation given in (5.24), we first determine, given that $A_1$ or $A_2$ occurs, the probability that s is on at the i-th cheapest period on day $d+1$ for $i = 1, 2$, that is, the probability $\delta_i(d+1) = \delta_i(11)$. Using reasoning similar to (5.25), based on the definition of conditional probability, it


is obtained

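$$\begin{aligned}
\delta_1(d+1) &= P(A_1 \mid A_1 \cup A_2) = \frac{P(A_1)}{P(A_1) + P(A_2)} = \frac{P_1^s(d)}{P_1^s(d) + P_2^s(d)} = \frac{0.5}{0.5 + 0.4} = \frac{5}{9}, \\
\delta_2(d+1) &= P(A_2 \mid A_1 \cup A_2) = \frac{P(A_2)}{P(A_1) + P(A_2)} = \frac{P_2^s(d)}{P_1^s(d) + P_2^s(d)} = \frac{0.4}{0.5 + 0.4} = \frac{4}{9}.
\end{aligned} \tag{5.26}$$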

Substituting these into formula (5.22), we have

$$\begin{aligned}
P_1^s(d+1) &= P_1^s(d) + \frac{1}{d+1}\left[\delta_1(d+1) - P_1^s(d)\right] = 0.5051, \\
P_2^s(d+1) &= P_2^s(d) + \frac{1}{d+1}\left[\delta_2(d+1) - P_2^s(d)\right] = 0.4040.
\end{aligned} \tag{5.27}$$

That is, $P(A_1)$ and $P(A_2)$ are updated by the Bayesian updating method to $P(A_1) = P_1^s(d+1) = 0.5051$ and $P(A_2) = P_2^s(d+1) = 0.4040$ respectively. Using these as the new starting point and following the same steps as in (5.26) and (5.27), we obtain $P_1^s(d+2) = 0.5093$ and $P_2^s(d+2) = 0.4074$. Repeating the process, it can be found that $P_1^s(d+k)$ and $P_2^s(d+k)$ increase steadily as k increases and converge to 5/9 and 4/9 respectively.

Now we turn to how the probability $P_3^s(d+k)$ is updated under the given scenario. As s is never on at the 3rd cheapest period from day 11 onwards, $\delta_3(d+k) = 0$ ($k = 1, 2, \ldots$) and $P_3^s(d+k)$ is updated based on equation (5.22) as

$$\begin{aligned}
P_3^s(d+1) &= P_3^s(d) + \frac{1}{d+1}\left[\delta_3(d+1) - P_3^s(d)\right] = 0.0909, \\
P_3^s(d+2) &= P_3^s(d+1) + \frac{1}{d+2}\left[\delta_3(d+2) - P_3^s(d+1)\right] = 0.0833.
\end{aligned}$$

Repeating the above process, it can be found that P s3 (d+k) steadily decreases

with k increasing and converges to 0 at the end.

From the above analysis, two observations follow. Firstly, the proposed probability estimation method based on Bayesian updating fits well with intuition. Secondly, the final probabilities are $P_1^s = 5/9$, $P_2^s = 4/9$ and $P_3^s = 0$ because the new information comes from the event that s is on only at the two cheapest price periods. This new information reinforces $P_1^s$ and $P_2^s$ and weakens $P_3^s$, following the fact that s is never on at the 3rd cheapest price period.
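The convergence in this second scenario can also be checked in closed form. The short derivation below is a sketch added for illustration; it assumes the example's starting values at day 10 and that the scenario continues unchanged for k further days.

```latex
% Multiplying the recursion P(d+1) = P(d) + [\delta - P(d)]/(d+1) by (d+1) gives
(d+1)\,P_i^s(d+1) = d\,P_i^s(d) + \delta_i(d+1).
% If P_1^s = 5x and P_2^s = 4x, then \delta_1 = 5/9 and \delta_2 = 4/9, and the
% update multiplies both by the same factor, so the ratio 5:4 is preserved and
% \delta_1 = 5/9, \delta_2 = 4/9, \delta_3 = 0 hold on every day. Summing over k days:
P_1^s(10+k) = \frac{5 + \tfrac{5}{9}k}{10+k}, \qquad
P_2^s(10+k) = \frac{4 + \tfrac{4}{9}k}{10+k}, \qquad
P_3^s(10+k) = \frac{1}{10+k},
% which tend to 5/9, 4/9 and 0 respectively as k \to \infty.
```

For k = 1 these expressions give 50/99, 40/99 and 1/11, and the three values always sum to 1.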

The above example with its two scenarios illustrates and justifies the basic idea and method used to estimate the usage probabilities $P_i^s$. The general formula for the probability updating is given as follows:


$$P_i^s(d+1) = P_i^s(d) + \frac{1}{d+1}\left[\delta_i(d+1) - P_i^s(d)\right] \tag{5.28}$$

where $P_i^s(d)$ is the estimated probability that s is on at the i-th cheapest price period based on the historical data up to day d, $P_i^s(d+1)$ is the updated probability based on the new usage data from day $d+1$, and $\delta_i(d+1)$ is the probability that appliance s is on at the i-th cheapest period on day $d+1$. When the hourly prices and usage data are received at the end of day $d+1$, $\delta_i(d+1)$ is calculated according to the following three cases.

Suppose that there are k possible operation periods for s and that the cost of each period (i.e. the sum of the hourly prices during the period) on day $d+1$ is $c_j(d+1)$ ($j = 1, \ldots, k$). When the $c_j(d+1)$ are ranked in ascending order, two or more time periods may have the same cost. In this case, the equal costs are still forced into distinct, consecutive ranks. The cost of the h-th cheapest time period is denoted as $r_h(d+1)$ ($h = 1, \ldots, k$).

• Case 1. s is not on at the i-th cheapest price period on day $d+1$; then

$$\delta_i(d+1) = 0;$$

• Case 2. s is on at the i-th cheapest price period on day $d+1$ with the cost $r_i(d+1)$, which satisfies $r_i(d+1) \neq r_h(d+1)$ ($\forall h$, $h \neq i$); then

$$\delta_i(d+1) = 1;$$

• Case 3. s is on at the i-th cheapest price period on day $d+1$ with the cost $r_i(d+1)$, but there are one or more other periods with $r_{h_l}(d+1) = r_i(d+1)$ ($l = 1, \ldots, k_0$); then

$$\delta_i(d+1) = \frac{P_i^s(d)}{P_i^s(d) + \sum_{l=1}^{k_0} P_{h_l}^s(d)}.$$
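The three cases can be sketched in a few lines of Python. This is only an illustrative sketch, not the thesis's implementation: the function names `delta_vector` and `recursive_update`, the example costs, and the even split used when every tied rank has zero history are assumptions introduced here. Costs are compared with exact equality, which presumes tied periods produce identical sums.

```python
def delta_vector(costs, on_period, P):
    """Return [delta_1(d+1), ..., delta_k(d+1)] following Cases 1-3.

    costs     -- c_j(d+1) for each candidate period j (0-based list)
    on_period -- index j of the period in which s was actually on
    P         -- P[h] is the current estimate for rank h+1, i.e. P_{h+1}^s(d)
    """
    k = len(costs)
    ranked_costs = sorted(costs)          # r_h(d+1); ties get consecutive ranks
    on_cost = costs[on_period]
    tied = [h for h in range(k) if ranked_costs[h] == on_cost]
    total = sum(P[h] for h in tied)
    delta = [0.0] * k                     # Case 1 for every other rank
    for h in tied:
        if len(tied) == 1:
            delta[h] = 1.0                # Case 2: the cost is unique
        elif total > 0:
            delta[h] = P[h] / total       # Case 3: share according to history
        else:
            delta[h] = 1.0 / len(tied)    # assumption: even split, no history
    return delta


def recursive_update(P, delta, d):
    """Apply (5.28) to every rank, given d days of history."""
    return [p + (dl - p) / (d + 1) for p, dl in zip(P, delta)]


# Second scenario of the example: OT1 and OT2 tie, s is on at OT1.
# The cost values themselves are hypothetical; only their ordering matters.
P10 = [0.5, 0.4, 0.1, 0.0]
delta11 = delta_vector([3.0, 3.0, 4.0, 5.0], on_period=0, P=P10)
P11 = recursive_update(P10, delta11, d=10)
print([round(x, 4) for x in delta11], [round(x, 4) for x in P11])
```

Because each call only needs the previous estimate and one day of data, the smart meter never has to keep the full usage history.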

Further, the theoretical properties and possible extensions of the recursive formula given in (5.28) are as follows: 1) it is computationally efficient, as the model only uses new data for updating without revisiting the old historical data, and it is a one-pass algorithm in the sense that each piece of data is processed only once. It is therefore especially suitable for the real-time and large-scale learning required to process big smart meter data; 2) it does not need to store the historical data, as the model only uses new daily data for


updating, which greatly reduces the need for data storage and transformation. It belongs to the class of recursive learning algorithms for linear models, and its convergence to the true probability given sufficient historical data is proved in [Lju99]; 3) the current version of the updating scheme assumes that customer behaviour is relatively stable. In practice, however, customer behaviour might vary over time. A possible extension is to introduce a forgetting factor into the model updating. The forgetting factor discounts old data in the update, so that the model can track changes in behaviour over time. However, how to determine the right forgetting factor needs further research and substantial numerical experiments, which is part of our future work.
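To make the forgetting-factor idea concrete, the sketch below shows one common way of adding exponential forgetting to a recursive average. It is an assumption made for illustration, not a scheme taken from the thesis: the gain 1/(d+1) is replaced by 1/n_eff, where the effective sample size n_eff is discounted by a factor `lam` each day, and `lam = 1` recovers the plain recursion. The function names and the value 0.9 are hypothetical.

```python
def update_with_forgetting(P, delta, n_eff, lam):
    """One daily update with forgetting factor lam (0 < lam <= 1).

    n_eff is the discounted number of days seen so far; with lam = 1 it
    equals d and the update reduces to P + (delta - P) / (d + 1).
    """
    n_new = lam * n_eff + 1.0
    P_new = [p + (dl - p) / n_new for p, dl in zip(P, delta)]
    return P_new, n_new


def replay(lam):
    """Customer uses rank 1 for 50 days, then switches to rank 2 for 50 days."""
    P, n_eff = [0.5, 0.5], 0.0
    observations = [[1.0, 0.0]] * 50 + [[0.0, 1.0]] * 50
    for delta in observations:
        P, n_eff = update_with_forgetting(P, delta, n_eff, lam)
    return P


P_plain = replay(1.0)    # no forgetting: old and new behaviour weigh equally
P_forget = replay(0.9)   # forgetting: the estimate follows the new behaviour
print([round(x, 3) for x in P_plain], [round(x, 3) for x in P_forget])
```

Without forgetting, the estimate ends halfway between the two behaviours, while the discounted version ends close to the customer's current habit. The price is a noisier estimate when behaviour is in fact stable.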

From the above intuitive examples and theoretically sound learning algorithm, three observations follow:

1. The probability model $P_i^s$ can accurately represent the different, uncertain behaviours of different customers when using appliance s. If a customer uses bill minimization software which always runs appliance s in the cheapest period, then $P_1^s = 1$ and $P_i^s = 0$ for all $i > 1$. If a customer is price-sensitive (but does not use bill minimization software), then he should have a higher $P_1^s$ but lower $P_i^s$ for all $i > 1$. If a customer is less price-sensitive or price-insensitive, then he should have a more even probability distribution over the different price periods. In short, this is an effective model for identifying customers' behaviour from uncertain energy usage signals.

2. For non-interruptible appliances (e.g. washing machines and dishwashers) and interruptible appliances (e.g. PHEVs), the i-th cheapest time period $PT_i^s(d)$ is computed differently. For a non-interruptible appliance, there are only up to $T_s - L_s + 1$ possibilities, while for an interruptible appliance there are up to $C_{T_s}^{L_s}$ possibilities, i.e. the binomial coefficient of choosing $L_s$ prices from $T_s$ prices.

3. To calculate the customer's daily bill for shiftable appliance s based on the probability model $P_i^s$, let $PT_i^s$ denote the i-th cheapest time period for appliance s on a given day and $PS_i^s$ denote the sum of the prices in the i-th cheapest time period $PT_i^s$. The daily bill of appliance s can then be calculated as follows:

$$B_s = \sum_i \left( PS_i^s \times E_s / L_s \times P_i^s \right). \tag{5.29}$$


It should be noted that the bill for appliance s is in fact the revenue that

the retailer expects to receive from the customer’s usage of appliance s.

Furthermore, for each shiftable appliance s on the given day, the scheduling

window is denoted as Hs. We denote the hourly energy consumption of

appliance s as ys,h, where h ∈ Hs. Then ys,h can be obtained via the

following formula.

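$$y_{s,h} = \sum_i \left( E_s / L_s \times P_i^s \times I_{i,s}^h \right) \tag{5.30}$$

where $I_{i,s}^h$ is defined as follows:

$$I_{i,s}^h = \begin{cases} 1 & \text{if } h \in PT_i^s \\ 0 & \text{if } h \notin PT_i^s. \end{cases}$$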
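The expected bill (5.29) and hourly consumption (5.30) can be sketched as follows for a non-interruptible appliance, whose candidate periods are the $T_s - L_s + 1$ contiguous runs inside the window. This is an illustrative sketch only: the function names, the four-hour window, the prices and the probabilities are hypothetical, while the 1.8 kWh and 2 hour figures echo the dishwasher row of Table 5.4.

```python
from math import comb


def ranked_periods(prices, window, L):
    """Contiguous candidate periods of length L, cheapest first.

    prices -- dict hour -> price; window -- ordered list of hours H_s.
    Returns (hours, cost) pairs; equal costs keep their window order.
    """
    cands = []
    for start in range(len(window) - L + 1):
        hours = tuple(window[start:start + L])
        cands.append((hours, sum(prices[h] for h in hours)))
    return sorted(cands, key=lambda c: c[1])


def expected_bill_and_load(prices, window, L, E, P):
    """Return (B_s, {h: y_{s,h}}) for rank probabilities P = [P_1, P_2, ...]."""
    periods = ranked_periods(prices, window, L)
    rate = E / L                                   # E_s / L_s, energy per hour
    bill = sum(cost * rate * p for (_, cost), p in zip(periods, P))   # (5.29)
    load = {h: 0.0 for h in window}
    for (hours, _), p in zip(periods, P):                              # (5.30)
        for h in hours:                            # indicator: h in PT_i
            load[h] += rate * p
    return bill, load


prices = {20: 0.10, 21: 0.08, 22: 0.06, 23: 0.09}  # hypothetical $/kWh
window = [20, 21, 22, 23]
bill, load = expected_bill_and_load(prices, window, L=2, E=1.8, P=[0.6, 0.3, 0.1])
print(round(bill, 4), {h: round(v, 2) for h, v in load.items()})

# An interruptible appliance may pick any L hours out of T, e.g. T = 12, L = 3:
print(comb(12, 3))
```

Since the rank probabilities sum to 1, the hourly loads add up to the appliance's daily energy $E_s$, which gives a quick sanity check on any implementation.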

5.4.2 Curtailable Appliances

The Learning Model

We assume that each time slot for curtailable appliances is 1 hour. However, our proposed learning algorithms can be adapted to any sub-hourly time slot (e.g. 10 minutes). The scheduling window for appliance $c \in C_n$ is defined as $H_c \triangleq \{a_c, \ldots, b_c\}$.

We consider the impacts of price and weather conditions in the electricity demand model of curtailable appliances. Note that the electricity demand of appliance c at hour h depends not only on the price at hour h but also on the prices at other hours.

Let $\{y_{c,h}(d), \mathbf{p}(d)\} = \{y_{c,h}(d), [p_{a_c}(d), \ldots, p_{b_c}(d), T_{a_c}(d), \ldots, T_{b_c}(d)]\}$, $d = 1, \ldots, D$, be the available historical input-output data of an unknown demand function, where the input data $\mathbf{p}(d) = [p_{a_c}(d), \ldots, p_{b_c}(d), T_{a_c}(d), \ldots, T_{b_c}(d)]$ represent the price and temperature signals during the scheduling window $H_c$ on day d and the output data $y_{c,h}(d)$ represent the energy consumption of appliance c at hour h on day d. We use a linear demand function (i.e. Eq. (5.31)) to represent the demand model of each appliance for each customer, that is, how a customer responds to the price and temperature signals when using curtailable appliances.


As the coefficients of such a model are unknown, they need to be learned from his-

torical data. The coefficients of each model are learned or determined by finding

those that minimize the squared differences between the output values predicted

by the model and the actual output values in historical data [Kai68].

As prices and temperatures normally change slowly with time, at any given time we only need to model the demand locally, i.e. over a small range of prices and temperatures. Since any smooth non-linear function can be approximated well by a linear function locally, the linear demand model is widely used in this research area [KSCdPM00] [AMY10] and is adopted in our work.

As a result, the demand model of appliance c at hour $h \in H_c$ can be defined as follows:

$$y_{c,h} = \beta_{c,h,0} + \beta_{c,h,a_c} p_{a_c} + \ldots + \beta_{c,h,b_c} p_{b_c} + \beta'_{c,h,a_c} T_{a_c} + \ldots + \beta'_{c,h,b_c} T_{b_c} + \varepsilon_{c,h} \tag{5.31}$$

where $p_{a_c}$ is the electricity price at hour $a_c$, $T_{a_c}$ is the temperature at hour $a_c$, $\beta_{c,h,0}, \beta_{c,h,a_c}, \ldots, \beta_{c,h,b_c}, \beta'_{c,h,a_c}, \ldots, \beta'_{c,h,b_c}$ are the parameters that need to be learned and $\varepsilon_{c,h}$ is the model error.

Furthermore, the direct price elasticity of demand and the cross-price elasticity of demand impose the following constraints on the coefficients $\beta_{c,h,a_c}, \ldots, \beta_{c,h,b_c}$:

Constraint 1 (direct price elasticity of demand): If the price of electricity at hour h rises, the demand for electricity at this hour falls. That is, the following inequality holds:

$$\beta_{c,h,h} < 0. \tag{5.32}$$

Constraint 2 (cross-price elasticity of demand): If we regard the electricity in 24 hours as 24 products, the electricity at hour h and the electricity at some other hour l, $h \neq l$, can be regarded as substitutes. The cross elasticity measures the responsiveness of the demand for electricity at hour $h \in H_c$ to a change in price at some other hour $l \in H_c$. When the price of electricity at hour l increases while the prices at the other hours are unchanged, some demand may be shifted from hour l to hour h. As a result, the cross elasticity is positive. That is, for each $h, l \in H_c$,


the following inequality holds:

$$\beta_{c,h,l} > 0 \quad \text{if } h \neq l. \tag{5.33}$$

Estimation of the Learning Model

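As mentioned above, we need to find the coefficient estimates $\beta = \{\beta_{c,h,0}, \beta_{c,h,a_c}, \ldots, \beta_{c,h,b_c}, \beta'_{c,h,a_c}, \ldots, \beta'_{c,h,b_c}\}$ that minimize the following criterion:

$$\beta = \arg\min_{\beta} S(\beta) = \arg\min_{\beta} \sum_{d=1}^{D} \Big( y_{c,h}(d) - \beta_{c,h,0} - \beta_{c,h,a_c} p_{a_c}(d) - \ldots - \beta_{c,h,b_c} p_{b_c}(d) - \beta'_{c,h,a_c} T_{a_c}(d) - \ldots - \beta'_{c,h,b_c} T_{b_c}(d) \Big)^2. \tag{5.34}$$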

As a result, the estimation of the learning model is a quadratic programming

problem as follows:

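$$\begin{aligned}
\min_{\beta} \quad & S(\beta) \\
\text{s.t.} \quad & \beta_{c,h,h} < 0, \\
& \beta_{c,h,l} > 0 \ \text{ if } h \neq l.
\end{aligned} \tag{5.35}$$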

Finally, given the hourly price and temperature information and the corresponding hourly appliance-level demand information, the aim of this learning algorithm is to estimate the coefficients $\beta_{c,h,0}, \beta_{c,h,a_c}, \ldots, \beta_{c,h,b_c}, \beta'_{c,h,a_c}, \ldots, \beta'_{c,h,b_c}$. Common optimization techniques [CL96] can be used to solve the above quadratic programming problem.
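As an illustration of how such a sign-constrained least-squares fit can be computed, the sketch below uses cyclic coordinate descent with clipping. This solver is a stand-in chosen here for its simplicity; it is not the technique of [CL96]. Further assumptions: the strict inequalities are approximated with a small margin `EPS`, the window has only two hours, and the synthetic data codes the prices and the temperature as low (-1) and high (+1) levels, which makes the columns orthogonal so that the method converges immediately. All names and numbers are hypothetical.

```python
from itertools import product

INF = float("inf")
EPS = 1e-6   # margin standing in for the strict inequalities (assumption)


def fit_constrained(X, y, lower, upper, sweeps=50):
    """Minimise sum of squared residuals subject to lower <= beta <= upper."""
    n, m = len(X), len(X[0])
    beta = [min(max(0.0, lower[j]), upper[j]) for j in range(m)]  # feasible start
    resid = [y[r] - sum(X[r][j] * beta[j] for j in range(m)) for r in range(n)]
    col_sq = [sum(row[j] ** 2 for row in X) for j in range(m)]
    for _ in range(sweeps):
        for j in range(m):
            if col_sq[j] == 0.0:
                continue
            # Exact minimiser along coordinate j, then clip to the bounds.
            step = sum(X[r][j] * resid[r] for r in range(n)) / col_sq[j]
            new = min(max(beta[j] + step, lower[j]), upper[j])
            change = new - beta[j]
            for r in range(n):
                resid[r] -= change * X[r][j]
            beta[j] = new
    return beta


# Columns: intercept, own price p_h, other-hour price p_l, temperature T.
X = [[1.0, ph, pl, t] for ph, pl, t in product([-1.0, 1.0], repeat=3)]
lower = [-INF, -INF, EPS, -INF]   # beta_{h,l} > 0 for l != h
upper = [INF, -EPS, INF, INF]     # beta_{h,h} < 0

# Data consistent with both elasticity constraints.
y_ok = [2.0 - 0.5 * r[1] + 0.3 * r[2] - 0.2 * r[3] for r in X]
beta_ok = fit_constrained(X, y_ok, lower, upper)

# Data whose cross-price response has the "wrong" sign: the bound becomes active.
y_bad = [2.0 - 0.5 * r[1] - 0.1 * r[2] - 0.2 * r[3] for r in X]
beta_bad = fit_constrained(X, y_bad, lower, upper)
print([round(b, 4) for b in beta_ok], [round(b, 4) for b in beta_bad])
```

On real price and temperature data the columns are correlated, so more sweeps or a dedicated quadratic programming solver would be needed.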

5.5 Pricing Optimization for Demand Response Management

Note that we omitted the subscript n in the previous section for notational simplicity. From this section onwards, we reinstate the subscript n in all relevant mathematical expressions. More specifically, $B_s$ and $y_{s,h}$ are re-denoted as $B_{n,s}$ and $y_{n,s,h}$ in this section. In the following, we firstly define some notation with respect to our proposed learning algorithms. Secondly,


a profit maximization based pricing algorithm for demand response management

is proposed.

5.5.1 Notation

Let $p_h$ denote the electricity price offered by the retailer at each hour $h \in H = \{1, 2, \ldots, H\}$. Usually, $H = 24$. We denote the hourly energy consumption of curtailable appliance c as $y_{n,c,h}$, where $h \in H_{n,c}$. $y_{n,c,h}$ can be obtained through the customer behaviour learning model presented in sub-section 5.4.2 (by solving optimization problem (5.35)). As a result, the daily bill of curtailable appliance c for customer n can be represented as $B_{n,c} = \sum_{h \in H_{n,c}} p_h \times y_{n,c,h}$.

5.5.2 Pricing Optimization – Problem Formulation

In this subsection, the energy cost model for the retailer will be discussed first,

and then a profit maximization based smart pricing model will be presented.

As the energy consumption of customers for curtailable appliances depends not only on prices but also on temperatures, it is important to know the temperature information for the next 24 hours beforehand in order to optimize the retail prices for the next day. As weather forecasting techniques are already very mature, we assume that the temperature information is available.

We define a cost function $C_h(DE_h)$ indicating the cost to the retailer of providing electricity at each hour $h \in H$, where $DE_h$ represents the amount of power provided to all customers at hour h. We assume that the cost function $C_h(DE_h)$ is convex and increasing in $DE_h$ for each h [MRWJ+10b] [LCL11]. In view of this, the cost function is designed as follows [MRWJ+10b]:

$$C_h(DE_h) = a_h DE_h^2 + b_h DE_h + c_h \tag{5.36}$$

where $a_h > 0$, $b_h \geq 0$ and $c_h \geq 0$ at each hour $h \in H$.

For each hour $h \in H$, by defining the minimum price that the retailer (utility company) can offer as $p_h^{\min}$ and the maximum price as $p_h^{\max}$, we have:

$$p_h^{\min} \leq p_h \leq p_h^{\max}. \tag{5.37}$$

$p_h^{\min}$ and $p_h^{\max}$ are usually designed based on historical prices, market competition, customers' acceptability and the wholesale price. It is reasonable to


assume that the price the retailer can offer is greater than the wholesale price for each hour, and that there exists a price cap for the retail prices due to retail market competition and regulation. $p_h^{\max}$ can then be used to represent such a price cap.

Note that there is usually a maximum supply capacity of the retailer or a maximum load capacity of the power network, denoted as $E_h^{\max}$, at each hour. Thus, we have the following constraint:

$$DE_h = \sum_{n \in N} \Big( \sum_{s \in S_n} y_{n,s,h} + \sum_{c \in C_n} y_{n,c,h} \Big) \leq E_h^{\max}, \quad \forall h \in H. \tag{5.38}$$

Due to the inelasticity of energy use, we add a revenue constraint to improve the acceptability of the retailer's pricing strategies, i.e. there exists a total revenue cap, denoted as $R^{\max}$, for the retailer. Without such a constraint, the retail prices would keep rising to a level that is politically unacceptable to the government, political parties and energy regulators, as well as financially unacceptable to the customers. As a result, we have the following constraint:

$$R = \sum_{n \in N} \Big( \sum_{s \in S_n} B_{n,s} + \sum_{c \in C_n} B_{n,c} \Big) \leq R^{\max}. \tag{5.39}$$

Moreover, constraint (5.37) could be replaced by a constraint where the aver-

age price sent out by the retailer is kept constant and equal to a predetermined

value over a period of time (e.g. 24 hours) to ensure that customers are offered a

sufficient number of low-price periods [ZMPM13].

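Finally, the profit maximization problem for the retailer to optimize the hourly prices of the next 24 hours can be modelled as follows:

$$\begin{aligned}
\max_{p_h} \quad & \Big\{ R - \sum_{h \in H} C_h(DE_h) \Big\} \\
\text{subject to} \quad & \text{constraints (5.37), (5.38), (5.39).}
\end{aligned} \tag{5.40}$$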
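For a single candidate price vector, the objective of (5.40) and its three constraints can be evaluated as in the sketch below. It assumes that the aggregate hourly demand $DE_h$ and the individual bills have already been collected from the customers; the function name `retailer_profit` and all numbers are hypothetical and use three hours instead of 24 for brevity.

```python
def retailer_profit(prices, demand, bills, a, b, c, p_min, p_max, e_max, r_max):
    """Return (profit, feasible) for one price vector.

    prices -- p_h for each hour
    demand -- DE_h for each hour, already summed over customers and appliances
    bills  -- expected daily bills B_{n,s} and B_{n,c} of all appliances
    a,b,c  -- per-hour coefficients of the quadratic cost (5.36)
    """
    hours = range(len(prices))
    revenue = sum(bills)                                               # R
    cost = sum(a[h] * demand[h] ** 2 + b[h] * demand[h] + c[h] for h in hours)
    feasible = (
        all(p_min[h] <= prices[h] <= p_max[h] for h in hours)         # (5.37)
        and all(demand[h] <= e_max[h] for h in hours)                 # (5.38)
        and revenue <= r_max                                          # (5.39)
    )
    return revenue - cost, feasible


prices = [0.10, 0.20, 0.15]
demand = [10.0, 5.0, 8.0]
bills = [p * d for p, d in zip(prices, demand)]   # one aggregate bill per hour
args = dict(a=[0.001] * 3, b=[0.05] * 3, c=[0.0] * 3,
            p_min=[0.05] * 3, p_max=[0.25] * 3, e_max=[12.0] * 3)
profit, ok = retailer_profit(prices, demand, bills, r_max=4.0, **args)
_, ok_tight = retailer_profit(prices, demand, bills, r_max=3.0, **args)
print(round(profit, 3), ok, ok_tight)
```

Lowering the revenue cap from 4.0 to 3.0 turns the same price vector infeasible, which is the situation the constraint handling of the solution algorithm has to deal with.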

5.5.3 Solution Algorithm

In this subsection, we adopt GA based distributed pricing algorithms to solve the profit maximization problem based on the customer behaviour learning results.

The hourly consumption $y_{n,s,h}$ is a step (discontinuous) function of $p_h$, and, to complicate matters further, this discontinuous function depends on the i-th cheapest time period $PT_i^s$. Although $PT_i^s$ is a function of $p_h$, the dependence of $y_{n,s,h}$ on $p_h$ is very complicated and it seems


Algorithm 5 GA based pricing algorithm to Eq. (5.40) executed by the retailer

1: Population initialization, i.e. generating a population of PN chromosomes randomly; each chromosome denotes a strategy of the retailer.
2: for i = 1 to PN do
3:     The retailer announces strategy i, i.e. announces 24-hour prices by decoding the i-th chromosome;
4:     Receive the response (i.e. the hourly energy consumption and daily bill payment information) of each customer n to strategy i.
5:     Check the feasibility of strategy i to see if it satisfies all the constraints (5.37 - 5.39). If not, handle the invalid individuals by the approach proposed in [Deb00]. Then, obtain the fitness value of strategy i.
6: end for
7: A new generation of chromosomes is created by using the selection, crossover and mutation operations of the genetic algorithm.
8: Steps 2 - 7 are repeated until the stopping condition is reached.
9: The retailer announces the finalized price vector to the smart meters (customers) via the two-way communication infrastructure at the beginning of the scheduling horizon.

Algorithm 6 Response to price signals based on the learning results executed by each smart meter

1: Receive the price information from the retailer.
2: The smart meter calculates the hourly energy consumption and daily bill payment of each appliance in the household based on learning results of the proposed learning models in Section 5.4.
3: The smart meter sends back the total energy consumption at each hour and the total expected daily bill to the retailer via the two-way communication infrastructure.

extremely difficult or impossible to give an analytic expression to represent such a relationship. As a result, conventional nonlinear optimization methods are not applicable, and GA [Hol92] based pricing algorithms are proposed to solve this problem.

In our genetic algorithms, binary encoding and deterministic tournament selection without replacement are adopted [BT96]. For the crossover and mutation operations, we employ uniform crossover and bit flip mutation respectively [Sys89] [RMG00]. The constraints are handled by the approach proposed in [Deb00].

Finally, the GA based distributed decision-making algorithms for demand response management are shown as Algorithms 5 and 6 respectively. Before the


pricing optimization process starts at the beginning of each day, the learning models are updated according to the previous day's prices and electricity usage, and the newly generated learning results overwrite the old ones.

In Algorithm 5, step (1) initializes the population for the GA. Steps (2 - 6), which carry out the feasibility and fitness evaluation for the GA, work as follows: firstly, the retailer sends the prices for the next 24 hours to the customers (step 3). Each smart meter (customer) then reacts to the price signals by calculating the hourly energy consumption and daily bill payment of each appliance (Algorithm 6: step 2), and this information is sent back to the retailer via the two-way communication infrastructure (Algorithm 6: step 3). After receiving the response of each customer n (step 4), the retailer conducts the feasibility and fitness evaluations (step 5). A new generation of chromosomes is created via the GA operations (selection, crossover and mutation) in step (7). After the stopping condition is reached, the retailer announces the finalized price vector to each customer, as illustrated in step (9). In the end, the optimal day-ahead pricing strategy is found for the retailer.
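The loop of Algorithm 5 can be sketched compactly as below, under several loudly stated assumptions: the customers' response (Algorithm 6) is replaced by a toy linear demand function `respond`, only four hours and six bits per price are used, the constraints are handled by a feasibility-first comparison in the spirit of [Deb00] (whether this matches the thesis's exact implementation is not known), and the best individual found so far is remembered. All names and parameter values are hypothetical.

```python
import random

HOURS, BITS = 4, 6
P_MIN, P_MAX = 0.05, 0.25          # price bounds (5.37), enforced by decoding
E_MAX, R_MAX = 10.0, 4.0           # capacity (5.38) and revenue cap (5.39)
A, B = 0.005, 0.04                 # cost coefficients a_h, b_h (c_h = 0)
BASE = [8.0, 10.0, 12.0, 9.0]      # toy demand intercepts


def decode(chrom):
    """Map each group of BITS bits to a price in [P_MIN, P_MAX]."""
    prices = []
    for h in range(HOURS):
        bits = chrom[h * BITS:(h + 1) * BITS]
        level = int("".join(map(str, bits)), 2) / (2 ** BITS - 1)
        prices.append(P_MIN + level * (P_MAX - P_MIN))
    return prices


def respond(prices):
    """Toy stand-in for Algorithm 6: returns (hourly demand, total bill)."""
    demand = [max(BASE[h] - 20.0 * prices[h], 0.0) for h in range(HOURS)]
    return demand, sum(p * d for p, d in zip(prices, demand))


def evaluate(chrom):
    """Sortable key: any feasible strategy beats any infeasible one."""
    demand, revenue = respond(decode(chrom))
    violation = sum(max(d - E_MAX, 0.0) for d in demand) + max(revenue - R_MAX, 0.0)
    if violation > 0:
        return (0, -violation)
    return (1, revenue - sum(A * d * d + B * d for d in demand))


def tournament(pop, keys, rng):
    """Deterministic binary tournament without replacement (two passes)."""
    winners = []
    for _ in range(2):
        idx = list(range(len(pop)))
        rng.shuffle(idx)
        for i in range(0, len(idx) - 1, 2):
            a, b = idx[i], idx[i + 1]
            winners.append(pop[a] if keys[a] >= keys[b] else pop[b])
    return winners


def run_ga(pop_size=20, generations=40, p_mut=0.02, seed=0):
    """pop_size must be even."""
    rng = random.Random(seed)
    n = HOURS * BITS
    pop = [[rng.randint(0, 1) for _ in range(n)] for _ in range(pop_size)]
    best = max(pop, key=evaluate)
    for _ in range(generations):
        parents = tournament(pop, [evaluate(ch) for ch in pop], rng)
        children = []
        for i in range(0, pop_size, 2):
            p1, p2 = parents[i], parents[i + 1]
            mask = [rng.random() < 0.5 for _ in range(n)]        # uniform crossover
            children.append([x if m else y for x, y, m in zip(p1, p2, mask)])
            children.append([y if m else x for x, y, m in zip(p1, p2, mask)])
        # Bit flip mutation.
        pop = [[bit ^ int(rng.random() < p_mut) for bit in ch] for ch in children]
        best = max(pop + [best], key=evaluate)
    return decode(best), evaluate(best)


prices, fitness = run_ga(seed=1)
print([round(p, 3) for p in prices], fitness)
```

In the distributed setting described above, `respond` would be a round trip to the smart meters rather than a local function call, so each fitness evaluation costs one exchange of messages per customer.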

As the learning models are embedded in the customers' smart meters, these learning units carry out the customer behaviour analysis in a distributed manner within each smart meter and are not visible to the retailer. In addition, our proposed distributed optimization algorithms can carry out the pricing optimization without intruding on the customers' privacy. This is based on the assumption that there exists a third-party regulator between the retailer and the smart meters. Each customer's expected energy consumption data is firstly uploaded to the third party and processed there. The third party only passes the aggregated consumption data of a whole region, or an abstract model, to the retailer for pricing optimization. That is, the retailer does not have direct access to any individual customer's data.

5.5.4 Benefits to the Retailer and its Customers

The proposed learning models and pricing optimization models can benefit

both the retailer and the customers.

On the one hand, the learning outcome of our proposed learning models can

increase a customer’s understanding on how often their usages are in the cheaper


periods and in the more expensive periods. In this way, a customer is able to identify the potential saving from switching to the cheaper periods. If the customer changes his/her usage pattern, the probability models will be updated accordingly, as the model is updated recursively. The customer can then see how much improvement he/she has achieved and what further actions can be taken, until reaching a usage pattern in which the cheapest operation periods are selected most of the time.

On the other hand, as the retailer’s optimization model is a day-ahead pricing

model, once the customers change their usage patterns, the retailer’s optimal

prices will be updated accordingly based on the profit maximization model to

reflect the changes.

As a result, both parties (retailer and the customers) will benefit from the

proposed learning and pricing models.

5.6 Numerical Results

In this section, we will firstly evaluate our proposed learning models based on

historical data and the learning results will be stored locally in the smart meter.

Further, a distributed pricing optimization model is implemented to obtain opti-

mal day-ahead prices by interacting with locally stored learning results and one

case study is implemented to evaluate the effectiveness of our proposed pricing

model.

5.6.1 Learning Algorithms Evaluation

In the following, two proposed learning algorithms for shiftable appliances and

curtailable appliances will be evaluated respectively.

Shiftable Appliances

We use the models in [MZ13] and [CKS11] to generate data to test our proposed

learning algorithm for shiftable appliances (interruptible and non-interruptible

appliances respectively).

The model in [MZ13] is a payment minimization problem and linear program-

ming can be adopted to solve the problem. Since this model allows intermittent

operations of appliances, [MZ13] can be used to generate data for interruptible


appliances such as PHEVs. As [MZ13] only considers the financial cost in the ob-

jective function, data generated with this model will fit the energy consumption

patterns of price-sensitive customers.

In addition to the customers' payment, [CKS11] considers the comfort of customers by adding a waiting penalty to the objective function. This model aims to find the starting time of each appliance that minimizes the customer's bill payment plus this waiting penalty. Since the model in [CKS11] considers not only the financial cost but also comfort, data generated with this model are more likely to fit the energy consumption patterns of price-insensitive or moderately price-sensitive customers.

We carry out the evaluations on 4 appliances (dishwasher, washing machine, clothes dryer and PHEV). We use the actual dynamic day-ahead price data of ISO New England from 1 January 2012 to 21 December 2012 [ISO12]. The parameter settings of these two models are shown in Table 5.3 and Table 5.4. In Table 5.3, $H_s$ represents the scheduling window for appliance s, where the starting time is the earliest possible time to switch on the appliance and the ending time is the latest possible time to switch it off. $\gamma_s^{\min}$ and $\gamma_s^{\max}$ stand for the minimum and maximum hourly energy consumption of each interruptible appliance. For each non-interruptible appliance, $P_s$ in Table 5.4 is the hourly inconvenience cost incurred by delaying the operation of the appliance to a later (cheaper) price period.

According to the data generated, the learning results for shiftable appliances

are shown as Figure 5.1.

When looking at the learning result of PHEV, we can find that the customer

uses the appliance in the cheapest period (rank 1) with a probability of 1. In

other words, this customer is very price-sensitive and always switches on the

appliance in the cheapest period. In addition to the payment consideration,

learning results of dish washer, washing machine and clothes dryer reveal that

the energy consumption patterns of these customers are possibly influenced by

other factors such as less waiting time (life comforts). Note that the learning

results under model [CKS11] are highly dependent on the value of inconvenience

cost Ps and the price signals. As a result, the probability distribution in Figure

5.1 is actually reflecting the customer’s behaviour patterns under the given input

data.


Table 5.3: Parameters for each interruptible appliance

Appliance Name     E_s        H_s         γ_s^min    γ_s^max    L_s
PHEV               9.9 kWh    8PM-8AM     0 kWh      3.3 kWh    3 hrs

Table 5.4: Parameters for each non-interruptible appliance

Appliance Name     E_s        H_s         P_s          L_s
Dishwasher         1.8 kWh    8PM-2AM     0.3 $/hr     2 hrs
Washing machine    3.4 kWh    8AM-10PM    0.3 $/hr     2 hrs
Clothes dryer      3.4 kWh    7PM-6AM     0.15 $/hr    2 hrs

[Figure 5.1 consists of four panels (Dish washer, Washing machine, Clothes dryer, PHEV), each plotting Probability against Rank.]

Figure 5.1: Learning results under model [MZ13] and [CKS11].

Curtailable Appliances

Based on the hourly price and temperature information from 1 January 2012 to 21

December 2012 of ISO New England [ISO12], we generate the energy consumption

data of curtailable appliances in response to the above price and temperature

signals using a fuzzy-logic system [KZE02].

In Chapters 3 and 4, we assume that either the retailer knows the customers’ responses (Chapter 3) or the retailer is able to identify the customers’ responses via the two-way communication between the retailer and the home energy management system in the smart meters (Chapter 4). As a result, the evaluations of


these two models can be implemented by setting the model parameters manually. However, in this model (Chapter 5), the retailer does not know the customers’ consumption patterns and has to learn them from historical data. As there is no available dataset with which to evaluate our proposed learning model for curtailable appliances, we adopt the fuzzy-logic system to generate the data. Note that the fuzzy-logic system is not essential to this work: we only need a method that generates reasonable and sensible data for testing our proposed models, and the fuzzy-logic method fits this purpose.

In this thesis, we use a fuzzy-logic system to simulate the energy consumption data of the space heater. The adopted fuzzy-logic system uses fixed rules and membership functions. It is a Mamdani-type model, and the centroid method is adopted for defuzzification [MA75] [Lee90].

The fuzzy membership functions of the space heater are shown in Figure 5.2. Let P denote Price, T denote Temperature and U denote Usage. Then the fuzzy rules of the space heater are as follows:

• If (P is low) and (T is low) then (U is much-high)

• If (P is low) and (T is average) then (U is little-high)

• If (P is low) and (T is high) then (U is average)

• If (P is average) and (T is low) then (U is little-high)

• If (P is average) and (T is average) then (U is average)

• If (P is average) and (T is high) then (U is little-low)

• If (P is high) and (T is low) then (U is average)

• If (P is high) and (T is average) then (U is little-low)

• If (P is high) and (T is high) then (U is much-low)
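The nine rules above can be turned into a small Mamdani-style generator. The sketch below uses min for "and", clipping for implication, max for aggregation and a discretised centroid for defuzzification. The membership-function breakpoints are assumptions chosen to span the axis ranges of Figure 5.2 (price 0 to 20 cents, temperature -15 to 30 C, usage 0.5 to 5 KW); the exact shapes used in the thesis are not recoverable from the text, and the names `degree` and `space_heater_usage` are hypothetical.

```python
import numpy as np

# Assumed piecewise-linear membership functions as (breakpoints, degrees).
PRICE = {"low": ([0, 10], [1, 0]),
         "average": ([0, 10, 20], [0, 1, 0]),
         "high": ([10, 20], [0, 1])}
TEMP = {"low": ([-15, 7.5], [1, 0]),
        "average": ([-15, 7.5, 30], [0, 1, 0]),
        "high": ([7.5, 30], [0, 1])}
USAGE = {"much-low": ([0.5, 1.625], [1, 0]),
         "little-low": ([0.5, 1.625, 2.75], [0, 1, 0]),
         "average": ([1.625, 2.75, 3.875], [0, 1, 0]),
         "little-high": ([2.75, 3.875, 5.0], [0, 1, 0]),
         "much-high": ([3.875, 5.0], [0, 1])}

# The nine rules listed above: (price set, temperature set) -> usage set.
RULES = {("low", "low"): "much-high", ("low", "average"): "little-high",
         ("low", "high"): "average", ("average", "low"): "little-high",
         ("average", "average"): "average", ("average", "high"): "little-low",
         ("high", "low"): "average", ("high", "average"): "little-low",
         ("high", "high"): "much-low"}

def degree(mf, x):
    """Membership degree of x; np.interp holds the end values outside the range."""
    xp, fp = mf
    return np.interp(x, xp, fp)

def space_heater_usage(price, temp, n=501):
    """Mamdani inference with centroid defuzzification on an n-point grid."""
    u = np.linspace(0.5, 5.0, n)
    aggregated = np.zeros_like(u)
    for (p_set, t_set), u_set in RULES.items():
        strength = min(degree(PRICE[p_set], price), degree(TEMP[t_set], temp))
        clipped = np.minimum(strength, degree(USAGE[u_set], u))
        aggregated = np.maximum(aggregated, clipped)
    return float(np.sum(u * aggregated) / np.sum(aggregated))
```

With these assumed shapes, a cheap and cold hour yields a usage near the top of the range and an expensive and warm hour a usage near the bottom, which matches the direction of the rules.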

After generating the energy consumption data, we measure the accuracy of the proposed learning model using the Root Mean Square Error (RMSE) and the Mean Absolute Percentage Error (MAPE).
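For reference, the two error measures can be computed as in the following sketch, which uses their standard definitions. The function names are illustrative, and the MAPE version assumes that no actual value is zero.

```python
import math

def rmse(actual, forecast):
    """Root mean square error between two equally long sequences."""
    n = len(actual)
    return math.sqrt(sum((a - f) ** 2 for a, f in zip(actual, forecast)) / n)

def mape(actual, forecast):
    """Mean absolute percentage error in percent (actual values must be non-zero)."""
    n = len(actual)
    return 100.0 * sum(abs((a - f) / a) for a, f in zip(actual, forecast)) / n
```

RMSE is expressed in the unit of the data (here KW), whereas MAPE is scale-free, which is why the two are reported together.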

We first divide the dataset into a training dataset and a testing dataset, and we assume for simulation purposes that the space heater is ‘on’ for 4 hours every day, from 8PM to 12AM. However, this assumption can be extended to any scenario.

The RMSE and MAPE for each hour from 8PM to 12AM are given in Table 5.5; the RMSE stays below 0.09 and the MAPE below 2.7%, which indicates a high accuracy of the proposed learning model. Furthermore, the estimated direct price elasticity of demand


Table 5.5: Error Measurements of Learning

Measure Type   8-9PM    9-10PM   10-11PM   11PM-12AM
RMSE           0.0884   0.0845   0.0872    0.0848
MAPE           2.11%    1.96%    2.63%     1.99%

[Figure 5.2: three panels plotting Degree of membership against Price (cents) with sets low, average and high; against Temperature (C) with sets low, average and high; and against Usage (KW) with sets much-low, little-low, average, little-high and much-high.]

Figure 5.2: Fuzzy membership functions of space heater.

and cross price elasticity of demand (β values) from 8PM to 12AM are shown in Figure 5.3. Moreover, Figure 5.4 shows the actual demand and forecast demand of the space heater from 8PM to 12AM (the next day), while Figure 5.5 shows the residuals of the forecasting.
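As an illustration of how direct and cross price elasticities can be estimated from historical data, the sketch below fits a log-log demand model for one target hour by ordinary least squares. This is an assumed textbook form, not the price-demand model defined earlier in the chapter: it omits the temperature signal, and the function name and array layout are hypothetical.

```python
import numpy as np

def estimate_elasticities(prices, demand):
    """Fit ln d = alpha + sum_k beta_k * ln p_k by ordinary least squares.

    prices: array of shape (days, hours) with the hourly prices of each day.
    demand: array of shape (days,) with the demand in one target hour.
    Returns (alpha, beta); the beta entry of the target hour is the direct
    elasticity and the remaining entries are cross elasticities.
    """
    prices = np.asarray(prices, dtype=float)
    demand = np.asarray(demand, dtype=float)
    design = np.column_stack([np.ones(len(demand)), np.log(prices)])
    coef, *_ = np.linalg.lstsq(design, np.log(demand), rcond=None)
    return coef[0], coef[1:]
```

In such a log-log form, each coefficient reads directly as the percentage change in demand for a one percent change in the corresponding hourly price.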

In the following, we demonstrate an application of the proposed learning mod-

els in pricing optimization for demand response management.


[Figure 5.3: β plotted against hour (8-9 PM, 9-10 PM, 10-11 PM, 11-12 AM).]

Figure 5.3: Estimated price elasticities of demand from 8PM to 12AM.

[Figure 5.4: four panels (8PM-9PM, 9PM-10PM, 10PM-11PM, 11PM-12AM), each plotting Actual and Forecast Usage (KW) against Days.]

Figure 5.4: Actual demand and forecast demand of space heater from 8PM to 12AM.


[Figure 5.5: four panels (8PM-9PM, 9PM-10PM, 10PM-11PM, 11PM-12AM), each plotting Residual against Days.]

Figure 5.5: Residuals of the forecasting of space heater from 8PM to 12AM.

5.6.2 Pricing Optimization

We simulate a neighbourhood consisting of 100 customers served by one energy retailer. It is assumed that each customer has 5 appliances: PHEV, dishwasher, washing machine, clothes dryer and space heater. The scheduling horizon is set from 8AM to 8AM (the next day). We assume that the customers are homogeneous, i.e. $E_{n,a} = E_a$, $H_{n,a} = H_a$, $\gamma_{n,a}^{\min} = \gamma_a^{\min}$, etc., where $a$ stands for each home appliance. Further, we assume that the retailer cannot identify the customers’ utility functions; as a result, learning algorithms are needed to determine the customers’ energy consumption patterns.

The evaluation of our proposed pricing optimization model is conducted in the following steps: firstly, learn the customers’ energy consumption patterns from the hourly price and temperature data between 1 January 2012 and 21 December 2012 of ISO New England [ISO12]; secondly, optimize the 24 hourly prices for the next day (22 December 2012) based on the learning results; thirdly, compare the profits and revenues under the original 24 hourly prices on 22 December 2012 with those under the optimized prices.

The cost of the energy provided to customers by the retailer is modelled by the cost function in Eq. (5.36). We assume that $b_h = 0$, $c_h = 0$ for all $h \in H$.
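The role of the coefficients can be illustrated with a short sketch. Eq. (5.36) is not reproduced in this section, so the code assumes a quadratic hourly cost of the form a_h L_h^2 + b_h L_h + c_h, which is a common choice but remains an assumption here; it also assumes that profit is revenue minus this cost. All function names are hypothetical.

```python
def energy_cost(loads, a, b=None, c=None):
    """Assumed quadratic cost: sum over hours of a_h*L_h^2 + b_h*L_h + c_h."""
    n = len(loads)
    b = b if b is not None else [0.0] * n  # b_h = 0, as assumed in the text
    c = c if c is not None else [0.0] * n  # c_h = 0, as assumed in the text
    return sum(a[h] * loads[h] ** 2 + b[h] * loads[h] + c[h] for h in range(n))

def revenue(prices, loads):
    """Total bill paid by the customers over the scheduling horizon."""
    return sum(p * l for p, l in zip(prices, loads))

def profit(prices, loads, a, b=None, c=None):
    """Retailer profit, taken here as revenue minus energy cost."""
    return revenue(prices, loads) - energy_cost(loads, a, b, c)
```

With b_h and c_h set to zero, the cost under this assumed form depends only on the squared hourly loads, so shifting load away from peak hours lowers the cost even when the total energy is unchanged.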


Table 5.6: Parameter Settings for Pricing Optimization

Parameter      Setting Value
$a_h$          The coefficient of the electricity cost function is set to $10^{-4}$ multiplied by the result of the original prices in the next day minus a constant; in our experiment, the constant is set to 2.5.
$p_h^{\min}$   The minimum price is set to the original prices in the next day minus a constant; in our experiment, the constant is set to 2.5.
$p_h^{\max}$   The maximum price is set to the original maximum price of the next day in the dataset (22 December 2012).
$E_h^{\max}$   The hourly maximum electricity consumption is set to the maximum amount of electricity consumed in the next day from the original data (22 December 2012).
$R^{\max}$     In order to compare the profit, the maximum revenue is set to the revenue achieved in the next day from the original data (22 December 2012).

Table 5.7: Parameter settings of GA

Parameter Name         Symbol   Value
Chromosome Length      $L_g$    10
Population Size        $P_N$    150
Mutation Probability   $P_g$    0.005
Terminate Generation   $T_g$    500

The setting for $a_h$ and the other parameters used in the pricing optimization evaluations are given in Table 5.6, and the parameters of the proposed genetic algorithm are shown in Table 5.7.
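To show how the settings in Table 5.7 enter a genetic algorithm, the following is a minimal binary-coded sketch in which each hourly price is encoded with 10 bits between its lower and upper bound. It is not the algorithm proposed in this chapter: tournament selection, single-point crossover and elitism are assumed operators, reading the chromosome length as bits per price is an assumption, and the `fitness` argument is a placeholder for the retailer's expected profit, with any constraint handling (for example on $E_h^{\max}$ and $R^{\max}$) left to that function.

```python
import random

def decode(bits, p_min, p_max, n_bits=10):
    """Map each n_bits-long gene to a price in [p_min[h], p_max[h]]."""
    prices = []
    for h in range(len(p_min)):
        gene = bits[h * n_bits:(h + 1) * n_bits]
        level = int("".join(str(b) for b in gene), 2) / (2 ** n_bits - 1)
        prices.append(p_min[h] + level * (p_max[h] - p_min[h]))
    return prices

def run_ga(fitness, p_min, p_max, n_bits=10, pop_size=150,
           p_mut=0.005, generations=500, seed=0):
    """Maximise fitness(prices); defaults follow Table 5.7."""
    rng = random.Random(seed)
    length = n_bits * len(p_min)

    def score(ind):
        return fitness(decode(ind, p_min, p_max, n_bits))

    pop = [[rng.randint(0, 1) for _ in range(length)] for _ in range(pop_size)]
    best = max(pop, key=score)
    for _ in range(generations):
        scored = [(score(ind), ind) for ind in pop]

        def tournament():
            # Binary tournament: the fitter of two random individuals wins.
            a, b = rng.sample(scored, 2)
            return a[1] if a[0] >= b[0] else b[1]

        children = [best[:]]  # elitism: carry the best individual over
        while len(children) < pop_size:
            mum, dad = tournament(), tournament()
            cut = rng.randrange(1, length)  # single-point crossover
            child = mum[:cut] + dad[cut:]
            children.append([1 - g if rng.random() < p_mut else g for g in child])
        pop = children
        best = max(pop, key=score)
    return decode(best, p_min, p_max, n_bits), score(best)
```

Because the best individual is always carried over, the best fitness in this sketch never decreases from one generation to the next, which gives a convergence curve that is monotone by construction.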

The GA-based pricing optimization algorithm converges at around the 240th generation, as can be seen in Figure 5.6, and the optimized 24 hourly prices are shown in Figure 5.7. The total energy consumption of the customers under the optimized hourly prices and the original hourly prices is given in Figure 5.8.

From Figure 5.7, we can see that the optimized prices during peak times (8PM-12AM) are much higher than those during off-peak times (1AM-6AM), which is more reasonable than the original prices. As a result, the energy consumption of customers is shifted from peak times to off-peak times, as can be seen in Figure 5.8. Furthermore, the revenue and profit under the optimized prices and the original prices are given in Figure 5.9. Note that the revenues and profits


shown in Figure 5.9 and the following Figure 5.10 are actually expected revenues

and profits.

From Figure 5.9, we can see that, for the same revenue ($176.47), i.e. the total bill of all customers under the optimized prices is the same as that under the original prices, the profit of the retailer under the optimized prices ($80.67) is higher than the profit under the original prices ($73.53). This example shows an important potential of the proposed pricing optimization model: it can increase the retailer’s profit without increasing customers’ expenses.
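The size of this single-day gain follows directly from the reported figures. Assuming that profit is revenue minus the retailer's energy cost (an interpretation, since the profit definition is not restated in this section), the same figures also give the implied cost under each price vector:

```latex
\[
\frac{80.67 - 73.53}{73.53} = \frac{7.14}{73.53} \approx 9.7\%,
\qquad
176.47 - 80.67 = 95.80,
\qquad
176.47 - 73.53 = 102.94 .
\]
```

In other words, at equal revenue the gain in profit for this day corresponds to an implied reduction of about 7.14 dollars in energy cost.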

To test the stability of our proposed learning models and pricing optimization models, we repeat the above simulation process 7 times to obtain the daily revenues and profits for the following week, which are illustrated in Figure 5.10. With our proposed pricing optimization algorithm, the retailer achieves an 11.08% increase in profit on average.

[Figure 5.6: fitness value plotted against generation.]

Figure 5.6: Convergence speed of the proposed genetic algorithm.


[Figure 5.7: Electricity Price (cents) plotted against Hours for Maximum Prices, Minimum Prices, Original Prices and Optimized Prices.]

Figure 5.7: Comparison between optimized prices and original prices.

[Figure 5.8: Total Electricity Consumption (kwh) plotted against Hours, under Optimized Prices and under Original Prices.]

Figure 5.8: Energy consumption under optimized prices and original prices.


[Figure 5.9: bar chart of Revenue and Profit in Cents, under Optimized Prices and under Original Prices.]

Figure 5.9: Profit and revenue under optimized prices and original prices.

[Figure 5.10: Dollars ($) plotted against Day (23/12/2012 to 29/12/2012) for revenues and profits under optimized prices and under original prices.]

Figure 5.10: Profits and revenues under optimized prices and original prices over one week.


5.7 Chapter Summary

In this chapter, we propose a machine-learning-based framework for smart pricing in demand response management, in which two learning algorithms are proposed to learn the behaviour patterns of customers in using different types of appliances. The first learning algorithm, designed for shiftable appliances, aims to obtain the probability distribution of different energy consumption patterns in response to the dynamic day-ahead prices. The second learning algorithm, proposed for curtailable appliances, aims to predict the hourly energy consumption in response to price and temperature signals. To demonstrate the application of our proposed customer behaviour learning models, a genetic-algorithm-based pricing optimization model for demand response management has been proposed for the retailer with the aim of maximizing its profit. Numerical results indicate the effectiveness of our proposed pricing optimization model and thus the benefits of our proposed customer behaviour learning models.


Chapter 6

Conclusions and Future Work

This thesis studies smart pricing methods for demand response management in the smart grid, based on game theory and machine learning techniques. It includes developing home energy management systems for customers, constructing an energy customer behaviour learning framework and building smart pricing strategies for energy retailers.

We concentrate on studying the interactions between the energy retailer and

its customers from the following three aspects:

• The home energy management problems for customers and the dynamic pricing problem for the retailer are first investigated in Chapter 3. Further, the interactions between the retailer and its customers are modelled through a leader-follower Stackelberg game. This chapter can be seen as our first attempt to solve smart pricing based demand response management problems. As the problems considered in this chapter are small scale, we proposed a KKT condition based solution to solve the Stackelberg game model effectively. Although it is very difficult to solve large-scale demand response management problems with the proposed KKT condition based solution, the study in Chapter 3 paved the way for our later studies and encouraged us to explore feasible and efficient solution approaches for large-scale problems.

• Chapter 4 can be seen as a substantial improvement in problem modelling and solution approaches over Chapter 3. In Chapter 4, a more comprehensive home energy management system, including the most commonly used types of home appliances, most possible types of applications and an efficient and easy-to-use waiting time cost model, is first proposed for the customers. Secondly, an improved smart pricing based demand response management model is proposed


for the retailer. The interactions between the retailer and its customers are modelled as a bilevel optimization problem. As the problems at both the customer side and the retailer side are large scale, distributed algorithms based on multi-population genetic algorithms are proposed to solve the bilevel problem efficiently. From the convergence analysis and Figure 4.4, we can see that the convergence time of these distributed algorithms does not increase much as the number of customers increases. Although we are not able to scale the simulations up to a much higher number of customers (e.g., a million customers) due to computer hardware limitations, the existing results suggest that our proposed distributed algorithms are very promising for solving large-scale demand response problems. This observation encourages us to do more work in the future.

• Different from the models presented in Chapters 3 and 4 where the home

energy management systems (HEMS) are assumed to be embedded in the smart

meters, in Chapter 5, we consider another scenario where there are no HEMS

installed in the smart meters. As a result, the retailer does not know customers’

energy consumption behaviours and therefore needs to learn them via historical

usage data. According to the load types, we firstly categorize the home appliances

into shiftable appliances and curtailable appliances and then propose two corre-

sponding appliance-level learning models. Finally, we propose genetic algorithms

based distributed algorithms to solve the smart pricing based demand response

problem faced by the retailer.

From the above three aspects of the smart pricing based demand response

study, we are able to identify the following contributions of this research.

6.1 Summary of Contributions

6.1.1 Stackelberg Game and Bilevel Optimization based

Demand Response Management

The following contributions have been made in the Stackelberg game and bilevel

optimization based Demand Response Management studies that form Chapters

3 and 4.


• From the customers’ point of view, the proposed approach models all possible categories of home appliances, including interruptible appliances, non-interruptible appliances and curtailable appliances, and can be easily realized by the home energy management software system integrated into smart meters. Curtailable appliances are not considered or modelled in the existing literature; our approach considers them and proposes the corresponding benefit maximization models. For interruptible and non-interruptible appliances, our approach proposes a realistic and user-friendly waiting time cost model which can be set up easily by an ordinary customer. Further, our approach uses utility functions based on real home appliances rather than theoretical and abstract household utility functions, and can therefore be used by customers to find the best usage and scheduling scheme to minimize their bills or maximize their benefits.

• From the retailer’s point of view, the proposed approach models the pricing optimization problem faced by a retailer, rather than a simplified or unrealistic (theoretical) pricing optimization problem. As there are HEMS embedded in the smart meters in this part of the study, the dynamic pricing determination problem faced by the retailer can be seen as a Stackelberg game or bilevel optimization problem. We further propose efficient solution methods to solve the Stackelberg game or the bilevel problem.

• From the pricing optimization method point of view, we propose a KKT condition based solution method for small-scale problems. Further, a hybrid optimization approach with multi-population genetic algorithms for the upper level problem and an individual optimization algorithm for each lower level problem is proposed for large-scale problems. Moreover, the existence of optimal solutions to the Stackelberg game and the bilevel model is proved to ensure that the proposed solution methods are built on a sound theoretical foundation.

6.1.2 Learning based Demand Response Management

For customers without HEMS embedded in the smart meters, the retailer does

not know customers’ behaviours. To overcome this, we propose two appliance-

level machine learning algorithms to learn the customers’ energy consumption

patterns. The following contributions have been made in the learning based

Demand Response Management study that forms Chapter 5.

• From customers’ point of view, we focus on the individual usage pattern


modelling rather than the aggregated usage pattern modelling. In addition, our

work, which learns the usage pattern in response to the dynamic prices and tem-

perature signals from historical data, is fundamentally different from the existing

research in individual modelling where they assumed the existence of a home

energy management system and therefore that the usage patterns can be derived

from the optimization of home energy management problems.

• According to the load types, we categorize the home appliances into shiftable

appliances and curtailable appliances and propose two appliance-level behaviour

learning models. For shiftable appliances, we propose a probabilistic recursive

learning model to learn the energy consumption patterns of customers in response

to the price signals. For curtailable appliances, we present a price-demand model

to predict the hourly energy consumption of customers in response to the price

and temperature signals.

• From retailer’s point of view, the demand response management model op-

timizes the day-ahead prices based on customer behaviour learning models. This

model is different from deterministic pricing models as it deals with uncertain-

ties in customers’ energy consumption patterns. In order to solve the model for

the retailer effectively, we further propose GA based distributed pricing solution

algorithms.

• The proposed learning models can greatly benefit the retailer and its customers. For example, at the customer side, the proposed probabilistic learning model enables the customer to identify the potential savings from switching to cheaper periods. If the customer changes his/her usage patterns, the probabilistic model will be updated, as it is a recursive updating model. The customer can then understand how much improvement he/she has achieved and what further actions can be taken, until reaching usage patterns in which the cheapest operation periods are selected most of the time. On the other hand, as the retailer’s optimization model is a day-ahead pricing model, once customers change their usage patterns, the retailer-side pricing model will be updated accordingly to reflect the changes.

6.2 Future Work

Although the project fulfils the aims of developing efficient smart pricing strate-

gies to demand response management for the smart grid via game theory and


machine learning techniques, there is still some work that can be developed in

the future.

• Firstly, as the demand response models proposed in this thesis assume that smart meters are installed in the households, one direction for future work is customer behaviour learning for demand response management problems where smart meters are not deployed in the customers’ houses. In that case, the retailer has to learn the customers’ energy consumption patterns from the aggregated demand data via machine learning algorithms.

• Secondly, the retailer may serve a pool of customers in which some customers have home energy management systems (HEMS) installed in their smart meters, some have smart meters without HEMS and others have no smart meters at all. How to effectively model the energy consumption patterns of such a customer pool, and thereby determine retail prices for these customers, is also one of our future considerations.

• Thirdly, another direction for future work is a differential pricing model for the retailer, as opposed to the existing unified pricing strategies. As different types of customers (e.g. price-sensitive customers, price-insensitive customers) may have different energy consumption patterns, different pricing strategies can be offered to different customers to incentivise them to participate in potential demand response programs.

• Fourthly, in the demand response management models considered here, there is only one retailer, which serves multiple customers. In our future work, we will consider a different scenario where multiple retailers compete with each other to maximize their profits. When there are multiple retailers, the price set by a retailer will depend not only on its own available power but also on the prices of other retailers. After the retailers announce the prices, the customers will react to the prices by optimally adjusting their energy consumption to maximize their benefits. As this is a sequential decision-making process for retailers and their customers, we can use a multi-leader, multi-follower Stackelberg game to model the interactions between retailers and their customers. The retailers will play a non-cooperative game with each other to reach a Nash equilibrium point, and the customers will react independently by deciding their own energy consumption in response to the announced prices.


• Fifthly, we will consider integrating electric vehicles and self-production facilities such as photovoltaics (PV) into our demand response model. As a new type of residential load, electric vehicles (EVs) can provide different opportunities as a home storage unit [EPM+14]. The customers can charge the EVs in low-price periods and sell the energy back to the grid (Vehicle-to-grid, V2G) in high-price periods. Also, the customers can use the excess power in the EV battery to supply the household loads (Vehicle-to-home, V2H) to help reduce the peak electricity demand, which can benefit both the customers and the retailers. Further, where a self-production facility such as PV is available in a household, it can be used to generate power to supply the household loads or to sell back to the grid.

• Finally, we will consider systems integration of energy supply and demand by creating one framework and providing analytical tools for the whole energy system. At the supply side, there is continuous electricity production (e.g. oil, coal and nuclear) and intermittent electricity production (e.g. wind and solar PV). At the demand side, there are various types of electricity demand from the transport, industry, commercial and residential sectors. The key question is how we can achieve effective energy systems integration and efficient matching of demand and supply.


Bibliography

[AMY10] HA Aalami, M Parsa Moghaddam, and GR Yousefi. Modeling and

prioritizing demand response programs in power markets. Electric

Power Systems Research, 80(4):426–435, 2010.

[AW14] Christopher O Adika and Lingfeng Wang. Autonomous appliance

scheduling for household energy management. Smart Grid, IEEE

Transactions on, 5(2):673–682, 2014.

[BGM+11] Mario Berges, Ethan Goldman, H Scott Matthews, Lucio Soibel-

man, and Kyle Anderson. User-centered nonintrusive electricity

load monitoring for residential buildings. Journal of Computing in

Civil Engineering, 25(6):471–480, 2011.

[BN13] Hassan A Bashir and Richard S Neville. Hybrid evolution-

ary computation for continuous optimization. arXiv preprint

arXiv:1303.3469, 2013.

[BT96] T. Blickle and L. Thiele. A comparison of selection schemes used in

evolutionary algorithms. Evolutionary Computation, 4(4):361–394,

1996.

[BT11] George EP Box and George C Tiao. Bayesian inference in statis-

tical analysis, volume 40. John Wiley & Sons, 2011.

[BV04] Stephen Boyd and Lieven Vandenberghe. Convex optimization.

Cambridge university press, 2004.

[BY13] Shengrong Bu and F Richard Yu. A game-theoretical scheme in

the smart grid with demand-side management: Towards a smart

cyber-physical power infrastructure. Emerging Topics in Comput-

ing, IEEE Transactions on, 1(1):22–32, 2013.


[BYL11] S. Bu, F. Richard Yu, and Peter X. Liu. A game-theoretical

decision-making scheme for electricity retailers in the smart grid

with demand-side management. In 2011 IEEE International Con-

ference on Smart Grid Communications (SmartGridComm), pages

387–391, 2011.

[CAC09] Miguel Carrion, Jose M Arroyo, and Antonio J Conejo. A bilevel

stochastic programming approach for retailer futures market trad-

ing. Power Systems, IEEE Transactions on, 24(3):1446–1456,

2009.

[CCMGB06] Antonio J Conejo, Enrique Castillo, Roberto Minguez, and Raquel

Garcia-Bertrand. Decomposition techniques in mathematical pro-

gramming: engineering and science applications. Springer Science

& Business Media, 2006.

[CCTL12] Hsueh-Hsien Chang, Kun-Long Chen, Yuan-Pin Tsai, and Wei-Jen

Lee. A new measurement method for power signatures of nonin-

trusive demand monitoring and load identification. Industry Ap-

plications, IEEE Transactions on, 48(2):764–771, 2012.

[CCYZ14] Bo Chai, Jiming Chen, Zaiyue Yang, and Yan Zhang. Demand

response management with multiple utility companies: A two-level

game approach. Smart Grid, IEEE Transactions on, 5(2):722–731,

2014.

[CF95] AJ Chipperfield and PJ Fleming. The matlab genetic algorithm

toolbox. In Applied control techniques using MATLAB, IEE Col-

loquium on, pages 10–1. IET, 1995.

[CG07] Herminia I Calvete and Carmen Gale. Linear bilevel multi-follower

programming with independent followers. Journal of Global Opti-

mization, 39(3):409–417, 2007.

[CHF03] Jeffery K Cochran, Shwu-Min Horng, and John W Fowler. A multi-

population genetic algorithm to solve multi-objective scheduling

problems for parallel machines. Computers & Operations Research,

30(7):1087–1102, 2003.


[CKS11] C. Chen, S. Kishore, and L.V. Snyder. An innovative rtp-based

residential power scheduling scheme for smart grids. In Acoustics,

Speech and Signal Processing (ICASSP), 2011 IEEE International

Conference on, pages 5956–5959. IEEE, 2011.

[CL96] Thomas F Coleman and Yuying Li. A reflective newton method for

minimizing a quadratic function subject to bounds on some of the

variables. SIAM Journal on Optimization, 6(4):1040–1058, 1996.

[CLCL13] Hsueh-Hsien Chang, Lung-Shu Lin, Nanming Chen, and Wei-

Jen Lee. Particle-swarm-optimization-based nonintrusive demand

monitoring and load identification in smart meters. Industry Ap-

plications, IEEE Transactions on, 49(5):2229–2236, 2013.

[CMS05] Benoît Colson, Patrice Marcotte, and Gilles Savard. Bilevel programming: A survey. 4OR, 3(2):87–107, 2005.

[Com12] Federal Energy Regulatory Commission. Assessment of

demand response and advanced metering staff report.

http://www.ferc.gov/legal/staff-reports/12-20-12-demand-

response.pdf, December 2012. Accessed 20 February 2013.

[CYG12] Jiang Chen, Bo Yang, and Xinping Guan. Optimal demand re-

sponse scheduling with stackelberg game approach under load un-

certainty for smart grid. In Smart Grid Communications (Smart-

GridComm), 2012 IEEE Third International Conference on, pages

546–551. IEEE, 2012.

[Deb00] Kalyanmoy Deb. An efficient constraint handling method for ge-

netic algorithms. Computer methods in applied mechanics and en-

gineering, 186(2):311–338, 2000.

[EPM+14] Ozan Erdinc, Nikolaos G Paterakis, Tiago DP Mendes, Anasta-

sios G Bakirtzis, and Joao PS Catalao. Smart household opera-

tion considering bi-directional ev and ess utilization by real-time

pricing-based dr. Smart Grid, IEEE Transactions on, 2014.

[FAM81] Jose Fortuny-Amat and Bruce McCarl. A representation and eco-

nomic interpretation of a two-level programming problem. Journal

of the operational Research Society, pages 783–792, 1981.


[FMXY12] Xi Fang, Satyajayant Misra, Guoliang Xue, and Dejun Yang. Smart grid – the new and improved power grid: A survey. Communications Surveys & Tutorials, IEEE, 14(4):944–980, 2012.

[FT91] Drew Fudenberg and Jean Tirole. Game theory. MIT Press, 1991.

[GCBK12] Vicenc Gomez, Michael Chertkov, Scott Backhaus, and Hilbert J Kappen. Learning price-elasticity of smart consumers in power distribution systems. In Smart Grid Communications (SmartGridComm), 2012 IEEE Third International Conference on, pages 647–652. IEEE, 2012.

[Gri] National Grid. A journey through time. http://www.nationalgrid75.com/timeline. Accessed 12 February 2015.

[Har92] George William Hart. Nonintrusive appliance load monitoring. Proceedings of the IEEE, 80(12):1870–1891, 1992.

[Her07] Karen Herter. Residential implementation of critical-peak pricing of electricity. Energy Policy, 35(4):2121–2130, 2007.

[HNG+13] JRM Hosking, R Natarajan, S Ghosh, S Subramanian, and X Zhang. Short-term forecasting of the daily load curve for residential electricity usage in the smart grid. Applied Stochastic Models in Business and Industry, 29(6):604–620, 2013.

[Hol92] John H Holland. Genetic algorithms. Scientific american, 267(1):66–72, 1992.

[HP08] Junqiao Han and Mary Ann Piette. Solutions for summer electric power shortages: Demand response and its applications in air conditioning and refrigerating systems. Lawrence Berkeley National Laboratory, 2008.

[Hyd15] BC Hydro. British columbia hydro conservation electricity rates. https://www.bchydro.com/accounts-billing/rates-energy-use/electricity-rates/residential-rates.html, December 2015. Accessed 07 December 2015.

[ISO12] ISONewEngland. Hourly zonal information. http://iso-ne.com/markets/hstdata/znl_info/hourly/index.html, December 2012. Accessed 1 March 2013.

[Jer85] Robert G Jeroslow. The polynomial hierarchy and a simple model for competitive analysis. Mathematical programming, 32(2):146–164, 1985.

[JT13] Liyan Jia and Lang Tong. Day ahead dynamic pricing for demand response in dynamic environments. In Decision and Control (CDC), 2013 IEEE 52nd Annual Conference on, pages 5608–5613. IEEE, 2013.

[Kai68] Thomas Kailath. An innovations approach to least-squares estimation–part i: Linear filtering in additive white noise. Automatic Control, IEEE Transactions on, 13(6):646–655, 1968.

[Kil10] Sila Kiliccote. Findings from seven years of field performance data for automated demand response in commercial buildings. Lawrence Berkeley National Laboratory, 2010.

[KJ11] J Zico Kolter and Matthew J Johnson. Redd: A public data set for energy disaggregation research. In Workshop on Data Mining Applications in Sustainability (SIGKDD). ACM, 2011.

[Kot11] Dwarkadas Pralhaddas Kothari. Modern power system analysis. Tata McGraw-Hill Education, 2011.

[KSCdPM00] Daniel S Kirschen, Goran Strbac, Pariya Cumperayot, and Dilemar de Paiva Mendes. Factoring the elasticity of demand in electricity prices. Power Systems, IEEE Transactions on, 15(2):612–617, 2000.

[KZE02] Alireza Khotanzad, Enwang Zhou, and Hassan Elragal. A neuro-fuzzy approach to short-term load forecasting in a price-sensitive environment. Power Systems, IEEE Transactions on, 17(4):1273–1282, 2002.

[LCL11] Na Li, Lijun Chen, and Steven H. Low. Optimal demand response based on utility maximization in power networks. In 2011 IEEE Power and Energy Society General Meeting, pages 1–8, 2011.

[Lee90] Chuen-Chien Lee. Fuzzy logic in control systems: fuzzy logic controller. ii. Systems, Man and Cybernetics, IEEE Transactions on, 20(2):419–435, 1990.

[Lju99] Lennart Ljung. System identification: theory for the user. Prentice Hall PTR, 2 edition, 1999.

[LR93] Sushil J Louis and Gregory JE Rawlins. Predicting convergence time for genetic algorithms. In Foundations of Genetic Algorithms 2, pages 141–161, 1993.

[LYH+14] Yi Liu, Chau Yuen, Shisheng Huang, N UL Hassan, Xiumin Wang, and Shengli Xie. Peak-to-average ratio constrained demand-side management with consumers preference in residential smart grid. Selected Topics in Signal Processing, IEEE Journal of, 8(6):1084–1097, 2014.

[MA75] Ebrahim H Mamdani and Sedrak Assilian. An experiment in linguistic synthesis with a fuzzy logic controller. International journal of man-machine studies, 7(1):1–13, 1975.

[MCWG95] Andreu Mas-Colell, Michael Dennis Whinston, and Jerry R Green. Microeconomic theory. Oxford university press New York, 1 edition, 1995.

[Mil11] Samuel John Odell Miller. Decentralised coordination of electrical generators in the smart grid using message passing. Technical report, University of Southampton, 2011.

[MPM10] Stephen McLaughlin, Dmitry Podkuiko, and Patrick McDaniel. Energy theft in the advanced metering infrastructure. In Critical Information Infrastructures Security, pages 176–187. Springer, 2010.

[MRLG10a] A-H Mohsenian-Rad and Alberto Leon-Garcia. Optimal residential load control with price prediction in real-time electricity pricing environments. Smart Grid, IEEE Transactions on, 1(2):120–133, 2010.

[MRLG10b] A-H Mohsenian-Rad and Alberto Leon-Garcia. Optimal residential load control with price prediction in real-time electricity pricing environments. Smart Grid, IEEE Transactions on, 1(2):120–133, 2010.

[MRWJ+10a] A-H Mohsenian-Rad, Vincent WS Wong, Juri Jatskevich, Robert Schober, and Alberto Leon-Garcia. Autonomous demand-side management based on game-theoretic energy consumption scheduling for the future smart grid. Smart Grid, IEEE Transactions on, 1(3):320–331, 2010.

[MRWJ+10b] A-H Mohsenian-Rad, Vincent WS Wong, Juri Jatskevich, Robert Schober, and Alberto Leon-Garcia. Autonomous demand-side management based on game-theoretic energy consumption scheduling for the future smart grid. Smart Grid, IEEE Transactions on, 1(3):320–331, 2010.

[MRWJS10] Amir-Hamed Mohsenian-Rad, Vincent WS Wong, Juri Jatskevich, and Robert Schober. Optimal and autonomous incentive-based energy consumption scheduling algorithm for smart grid. In Innovative Smart Grid Technologies (ISGT), 2010, pages 1–6. IEEE, 2010.

[MSB91] Heinz Muhlenbein, M Schomisch, and Joachim Born. The parallel genetic algorithm as function optimizer. Parallel computing, 17(6):619–632, 1991.

[MYEH10] Amir Moshari, GR Yousefi, Akbar Ebrahimi, and Saeid Haghbin. Demand-side behavior in the smart grid environment. In ISGT Europe, 2010 IEEE PES, pages 1–7. IEEE, 2010.

[MZ12] Fan-Lin Meng and Xiao-Jun Zeng. A stackelberg game approach to maximise electricity retailer’s profit and minimse customers’ bills for future smart grid. In Computational Intelligence (UKCI), 2012 12th UK Workshop on, pages 1–7. IEEE, 2012.

[MZ13] Fan-Lin Meng and Xiao-Jun Zeng. A stackelberg game-theoretic approach to optimal real-time pricing for the smart grid. Soft Computing, 17(12):2365–2380, 2013.

[MZ14] Fan-Lin Meng and Xiao-Jun Zeng. An optimal real-time pricing for demand-side management: A stackelberg game and genetic algorithm approach. In Neural Networks (IJCNN), 2014 International Joint Conference on, pages 1703–1710. IEEE, 2014.

[MZss] Fan-Lin Meng and Xiao-Jun Zeng. A hybrid optimization approach to demand response management for the smart grid. IEEE Transactions on Power Systems, 2015 (in Review Process).

[MZM13] Fan-Lin Meng, Xiao-Jun Zeng, and Qian Ma. Learning customer behaviour under real-time pricing in the smart grid. In Systems, Man, and Cybernetics (SMC), 2013 IEEE International Conference on, pages 3186–3191. IEEE, 2013.

[MZssa] Fan-Lin Meng and Xiao-Jun Zeng. Appliance level demand modeling and pricing optimization for demand response management in smart grid. In Neural Networks (IJCNN), 2015 International Joint Conference on, 2015 (In press).

[MZssb] Fan-Lin Meng and Xiao-Jun Zeng. A profit maximization approach to demand response management with customers behaviour learning in smart grid. IEEE Transactions on Smart Grid, 2015 (In press).

[MZZ+13] Sabita Maharjan, Quanyan Zhu, Yan Zhang, Stein Gjessing, and Tamer Basar. Dependable demand response management in the smart grid: A stackelberg game approach. Smart Grid, IEEE Transactions on, 4(1):120–132, 2013.

[oE03] U.S. Department of Energy. “Grid 2030”: A National Vision for Electricity’s Second 100 Years. Technical report, Office of Electric Transmission and Distribution, 2003.

[oE06] U.S. Department of Energy. Benefits of demand response in electricity markets and recommendations for achieving them: A report to the united states congress pursuant to section 1253 of the energy policy act of 2005. Technical report, U.S. Department of Energy, February 2006.

[oEC14] Department of Energy and Climate Change. Helping households to cut their energy bills. https://www.gov.uk/government/policies/helping-households-to-cut-their-energy-bills/supporting-pages/smart-meters, March 2014. Accessed 1 April 2014.

[oECC09] Department of Energy & Climate Change. Smarter grids: The opportunity. Technical report, The Stationery Office, December 2009.

[oECC13] UK Department of Energy & Climate Change. Historical electricity data: 1920 to 2013. Technical report, 2013.

[OPTa] GUROBI OPTIMIZATION. Mixed integer programming basics. http://www.gurobi.com/resources/getting-started/mip-basics. Accessed 12 February 2013.

[OPTb] TOMLAB OPTIMIZATION. Mixed-integer quadratic programming. http://tomopt.com/docs/models/tomlab_models007.php. Accessed 12 February 2013.

[OSKL13] Yusuf Ozturk, Datchanamoorthy Senthilkumar, Sudhakar Kumar, and Gene Lee. An intelligent home energy management system to improve demand response. Smart Grid, IEEE Transactions on, 4(2):694–701, 2013.

[QZHW13] Li Ping Qian, Ying Jun Angela Zhang, Jianwei Huang, and Yuan Wu. Demand response management via real-time electricity price control in smart grids. Selected Areas in Communications, IEEE Journal on, 31(7):1268–1280, 2013.

[RMG00] P. Reed, B. Minsker, and D.E. Goldberg. Designing a competent simple genetic algorithm for search and optimization. Water Resources Research, 36(12):3757–3761, 2000.

[RN09] Stuart Russell and Peter Norvig. Artificial intelligence: a modern approach. Prentice Hall, 3 edition, 2009.

[Rut89] Rob Rutenbar. Simulated annealing algorithms: an overview. Circuits and Devices Magazine, IEEE, 5(1):19–26, 1989.

[SCJ73] Marwaan Simaan and Jose B Cruz Jr. On the stackelberg strategy in nonzero-sum games. Journal of Optimization Theory and Applications, 11(5):533–555, 1973.

[Ser12] Ameren Services. Real time prices. https://www2.ameren.com/RetailEnergy/realtimeprices.aspx, March 2012. Accessed 21 May 2012.

[Sia14] Pierluigi Siano. Demand response and smart grids – a survey. Renewable and Sustainable Energy Reviews, 30:461–478, 2014.

[SMA15] SMARTGRID.GOV. Advanced metering infrastructure and customer systems. https://www.smartgrid.gov/recovery_act/deployment_status/ami_and_customer_systems, February 2015. Accessed 12 February 2015.

[SMD12] Ankur Sinha, Pekka Malo, and Kalyanmoy Deb. Unconstrained scalable test problems for single-objective bilevel optimization. In Evolutionary Computation (CEC), 2012 IEEE Congress on, pages 1–8. IEEE, 2012.

[SMRS+10a] Pedram Samadi, Amir-Hamed Mohsenian-Rad, Robert Schober, Vincent W. S. Wong, and Juri Jatskevich. Optimal Real-Time Pricing Algorithm Based on Utility Maximization for Smart Grid. In 2010 First IEEE International Conference on Smart Grid Communications, pages 415–420, 2010.

[SMRS+10b] Pedram Samadi, Amir-Hamed Mohsenian-Rad, Robert Schober, Vincent WS Wong, and Juri Jatskevich. Optimal real-time pricing algorithm based on utility maximization for smart grid. In Smart Grid Communications (SmartGridComm), 2010 First IEEE International Conference on, pages 415–420. IEEE, 2010.

[SS88] Torsten Soderstrom and Petre Stoica. System identification. Prentice-Hall, Inc., 1988.

[Sys89] Gilbert Syswerda. Uniform crossover in genetic algorithms. In Proceedings of the 3rd International Conference on Genetic Algorithms, pages 2–9, 1989.

[TK13] Prakash R Thimmapuram and Jinho Kim. Consumers’ price elasticity of demand modeling with economic effects on electricity markets using an agent-based model. Smart Grid, IEEE Transactions on, 4(1):390–397, 2013.

[TMKH96] Kit-Sang Tang, KF Man, Sam Kwong, and Qun He. Genetic algorithms and their applications. Signal Processing Magazine, IEEE, 13(6):22–37, 1996.

[TvS01] Theodore L. Turocy and Bernhard von Stengel. Game theory. Technical Report LSE-CDAM-2001-09, CDAM, LSE, 2001.

[VS52] Heinrich Von Stackelberg. The theory of the market economy. Oxford University Press, 1952.

[WL11] Zhimin Wang and Furong Li. Developing trend of domestic electricity tariffs in great britain. In Innovative Smart Grid Technologies (ISGT Europe), 2011 2nd IEEE PES International Conference and Exhibition on, pages 1–5. IEEE, 2011.

[WRH99] Darrell Whitley, Soraya Rana, and Robert B Heckendorn. The island model genetic algorithm: On separability, population size and convergence. Journal of Computing and Information Technology, 7:33–48, 1999.

[YTN13] Peng Yang, Gongguo Tang, and Arye Nehorai. A game-theoretic approach for optimal time-of-use electricity pricing. Power Systems, IEEE Transactions on, 28(2):884–892, 2013.

[ZLSS13] Zhuang Zhao, Won Cheol Lee, Yoan Shin, and Kyung-Bin Song. An optimal power scheduling method for demand response in home energy management system. Smart Grid, IEEE Transactions on, 4(3):1391–1400, 2013.

[ZMPM13] Marco Zugno, Juan Miguel Morales, Pierre Pinson, and Henrik Madsen. A bilevel model for electricity retailers’ participation in a demand response market environment. Energy Economics, 36:182–197, 2013.

[ZR11] Michael Zeifman and Kurt Roth. Nonintrusive appliance load monitoring: Review and outlook. Consumer Electronics, IEEE Transactions on, 57(1):76–84, 2011.